This is a bulk follow-up for *-diff2, *-diff3, and *-diff4.

Peter Memishian writes:
>  > General:
>  > 
>  >   Do you need to modify SNMP as well?  It looks to me like that should
>  >   be using LIFC_UNDER_IPMP, because it's an administrative utility.
> 
> Yeah, I was thinking about SNMP too.  It seems like we'd want to get that
> folded into the upstream net-snmp sourcebase once LIFC_UNDER_IPMP support
> is in OpenSolaris.  I'd be happy to take that on.

OK.

>  >   What else (possibly outside of ON) should use that flag?
> 
> I'd suspect some third-party IP monitoring/administration utilities will
> benefit from it, but I didn't have anything specific in mind.

It looks like something worth a heads-up of some sort, I think, though
I'm not sure where it'd do the most good.

>  > +++ IPMP-aware userland changes
>  > 
>  > usr/src/cmd/cmd-inet/sbin/dhcpagent/bound.c
>  > 
>  >   360: what happens if the IPMP test addresses handed out via DHCP are
>  >   on a separate subnet and thus need a separate set of probe targets
>  >   (i.e., "routers")?  One of the weird IPMP things we still do is that
>  >   we use the gateway addresses as probe targets, and this code will
>  >   limit the source of that information.
> 
> In that case, the administrator is expected to explicitly configure any
> probe targets -- e.g., by using dhcpinfo.  As you know, I regard the fact
> that the DHCP client directly configures routes at all as a bug.

It's an odd bit of functionality, then.  Maybe the right question for
me to ask is, "what would happen if you allowed pif_under_ipmp
interfaces to use default routes like the others?"

Would it fail?

>  > usr/src/cmd/cmd-inet/sbin/dhcpagent/interface.c
>  > 
>  >   740: logic to handle IFF_DUPLICATE and user manipulation of IFF_UP
>  >   seems to be gone.  (I know we talked about this in the past ... but
>  >   I'm not sure what the strategy is here.)
> 
> As you may recall, we decided to have dhcpagent no longer monitor changes
> to IFF_UP (this will be included in the PSARC commitment materials).  This
> change was necessary so that the DHCP client is not affected by an offline
> operation (since offline brings down test addresses).  The old IFF_DUPLICATE
> logic in interface.c is now the standard case, so that code block is no
> longer needed.

OK; I see it now.  It's just the special case that's no longer needed.

>  > usr/src/cmd/cmd-inet/usr.lib/in.ndpd/main.c
>  > 
>  >   151: could possibly check here that this doesn't blow off the end of
>  >   the buffer that holds optp->nd_opt_lla_hdw_addr.
> 
> By adding a `length' argument? 

Yes ... though on further inspection it looks like it wouldn't do any
good at all, as the callers all ignore buffer length and just write
whatever they want.

>  >   237: I don't think this works.  It's adding in sizeof (*ra) twice,
>  >   and I suspect it's sending out malformed packets.  The code should
>  >   instead look like this:
>  > 
>  >    optlen = add_opt_lla(pi, (struct nd_opt_lla *)pptr);
>  >    packetlen += optlen;
>  >    pptr += optlen;
> 
> Indeed, but I think it could be done as:
> 
>         packetlen += add_opt_lla(pi, (struct nd_opt_lla *)pptr); 
>         pptr = &packet[packetlen]; 

That won't work.  packet[] is defined as an array of uint64_t.  If you
want to do that, you'll need to do this for the second line:

        pptr = (char *)packet + packetlen;

>  > usr/src/cmd/cmd-inet/usr.sbin/snoop/snoop_capture.c
>  > 
>  >   142: does snoop support for the IPMP virtual interfaces arrive in
>  >   some separate project?  (I'm not clear on how IPMP rearchitecture
>  >   and Clearview observability relate.)
> 
> No, it arrives with this wad.  The check here is because this (highly
> questionable) loop is trying to choose a default datalink (not IP
> interface) to snoop on.  Since IPMP has no datalink, we skip IPMP IP
> interfaces.  Ultimately, snoop should use dladm_walk() to do this, not
> SIOCGLIFCONF.  But changing that seemed out-of-scope.

I'm still unclear on how that works.  It looks like this will just
find the member interfaces by default, and will not find an IPMP
interface.

Doesn't the Clearview observability node give you "/dev/net/ipmp0" to
snoop on, even if IPMP itself doesn't have a "datalink?"

>  > usr/src/lib/libinetutil/common/ifaddrlistx.c
>  > 
>  >   79: not that I care much, but why alloca here?  I thought most
>  >   avoided it unless it was needed or obviously safe.
> 
> Just following the same logic as ifaddrlist().  It could be a problem in a
> multithreaded program if there are lots and lots of IP addresses, though,
> so I'm willing to change both if you feel strongly.

I don't feel strongly about it; I was just surprised that alloca
showed up here.  Some others may be much more opposed to it, though.
(I know I've had to justify my use of it in the past.)

>  > usr/src/lib/libinetutil/common/libinetutil.h
[...]
>  >   98: this whole `ifaddrlist' thing seems unwise to me.  If we're
>  >   going to use BSD functions, we need to follow the BSD definitions of
>  >   those functions.  Otherwise, we need to come up with our own names.
>  >   Having a function with the same name as something in BSD but having
>  >   a different set of arguments and different functionality looks like
>  >   a recipe for disaster to me.
> 
> I agree, except that fork already happened when we added the second
> argument (address family) during IPv6 development.  Given that
> compatibility has already been compromised and this function is
> consolidation-private, it seemed OK to me to add the `flags' field.

Grumble; ok.

>  > usr/src/lib/libsocket/inet/interface_id.c
>  > 
>  >   147: while here, please fix this; bump the count up and (if
>  >   possible) add the redo logic.  I doubt anyone's going to touch this
>  >   stuff again soon.
> 
> The function is junk, as is if_nameindex().  I'd want to rewrite them
> rather than tweak them, but that's just too far out-of-scope for IPMP.
> I'll file a CR if you'd like though.

I wasn't suggesting a rewrite, but rather a minor sprucing as long as
you're in the neighborhood.  There's no reason these functions should
be allowed to fail just because plumbing activity is going on at the
same time the call is made.

Peter Memishian writes:
>  > +++ IPMP boot/DR support
>  > 
>  > usr/src/cmd/cmd-inet/usr.sbin/if_mpadm.c
>  > 
>  >   47-48,244-286: more duplicates.
> 
> Duplicates with what?  Unlike the ones that recently got added to libc,
> these call gettext(), which means the call sites don't have to get fouled
> up with that junk.

*sigh*

>  >   153-154,190-191: should these cases be errors?  I can understand
>  >   printing a message in this case, but returning with an error as
>  >   though something were wrong -- when the user's wishes have been
>  >   obeyed -- seems odd.
> 
> I agree it's odd, but that's the way the if_mpadm has always worked, and
> in theory administrative scripts could've been written with this in mind
> (e.g., if non-zero is returned, whatever was requested was not done), so I
> left it be.

OK.

>  >   159-163,218-222: this was changed from an actual error case to a
>  >   warning.  Why?
> 
> There wasn't a strong reason, but the reasoning was that those are just
> sanity checks and not needed for a successful offline/undo-offline
> operation.  I've never seen them trigger though, so if you feel strongly,
> I can change them to fatal errors.

I was more curious about how it was related to testing: whether you
saw an actual problem here.  If you're comfortable with the new
functionality, then I'm ok with it.

>  >   172: it's odd that this is tested twice -- once at line 162, and
>  >   then again here.
> 
> It was trying to catch two different cases (the first being that in.mpathd
> didn't do its job, and the second being that something cleared IFF_OFFLINE
> -- e.g., another application racing with SIOCGLIFFLAGS/SIOCSLIFFLAGS).
> But in truth the two cases aren't always distinguishable; I've removed the
> test at line 162.

OK.

>  >   175: nit: no need to do this (and risk the known problems with
>  >   203: similar issue here: no reason to bring up if it's already
>  >   there.
> 
> Based on the arguments passed to our calls to ifaddrlistx(), these
> addresses will always be IFF_UP and not IFF_UP, respectively (yes, it
> could change after teh call to ifaddrlistx(), but that change will not be
> reflected in ia_flags anyway).

Ah, ok; that seems fine.

>  >   203: too bad there's no way to remember which ones were
>  >   intentionally down before; this brings them all up.
> 
> Yes, though with the new design, this would only affect intentionally-down
> test addresses.

Yes ... I guess it's not an issue.

>  >   261,284: nit: equivalent to the more obvious 'strchr' in this case.
> 
> But what about efficiency? ;-)  Fixed.

strrchr is slower.  ;-}

>  > usr/src/cmd/rcm_daemon/common/ip_rcm.c
[...]
>  >   63-70: negative logic is hard to read (and wtf?).
> 
> Not my code, but this is a clever trick to allow lint to do format string
> checking (since otherwise the function call obscures the format string for
> lint).

If lint's argument checking doesn't work with the normal gettext
stuff, then I think that's a bug.  We should have the necessary
#define in a central place, not littered in source files.

>  >   2383: nit: could use "!grouped &&" to avoid the goto.
> 
> We'd also probably want to add that at line 2388, which starts to get
> ugly.  So I've left it as-is.

OK.

>  >   2437-2443: this is weird.
> 
> This code dates back to the old plumb-IPv6-on-demand logic, which no
> longer applies in recent Nevada since IPv6 loopback is always plumbed at
> boot and never gets unplumbed.  As such, I've removed it.

OK.

>  >   2541-2543: I don't understand why this "however" is true.  The code
>  >   already detects single-line files and sets stdif so that "netmask +
>  >   broadcast + up" is done, so why is an extra "up" needed here?
> 
> That logic only works if there was indeed something to bring up.  The case
> we're concerned about is a config which only has data addresses on the
> underlying interface (no test addresses).  In this case, all of the data
> addresses will have been filtered out by ifparse, so we won't perform any
> ifconfig operations at all, and we'd end up erroneously leaving the IP
> interface down.

Oh, ok.  So it's caused by the new architecture ... I get it.

>  >   I suspect that this would break some known Zones-related cases.
>  >   (E.g., one interface plumbed and up on the global zone only, another
>  >   plumbed but not up in the global zone, with shared-stack non-global
>  >   zones using logicals that second interface.)
> 
> If there's no test address (which is the case we're concerned about), then
> there will just be a 0.0.0.0 address instead (the data addresses are all
> off on the IPMP IP interface).  So I don't think bringing it up will have
> a big effect.

OK.  I was worried instead about the non-IPMP case where you have a
bunch of addresses up on the interface in separate zones, but the
zeroth instance is not up.

>  >   2553-2556: shouldn't this be done after all of the hostname files
>  >   are read?
> 
> When would it matter?  Note that any IPMP interfaces are already plumbed.

I suppose it doesn't matter.

>  > usr/src/cmd/svc/milestone/net-physical
[...]
>  >   158-159,165: easier to read as "for intf in $ipmp_list; do"
> 
> As above, I wanted to mirror the logic on lines 177 and 208.

Bleck.  I think it's wrong in all those places.

>  > usr/src/cmd/svc/shell/net_include.sh
>  > 
>  >   580-582: not sure if msglog likes all these open/closes.  I'd do
>  >   this as a single echo command, or use a () subshell.
> 
> In my testing, I didn't see any problems with it.  I tried the two
> alternatives you suggested and they were even uglier :-(

OK.

>  > +++ IPMP userland core
>  > 
>  > usr/src/cmd/cmd-inet/usr.lib/in.mpathd/mpd_probe.c
>  > 
>  >   406: nit: I think it might be necessary to read and discard until
>  >   msg_flags are zero after this event.
> 
> Why would there be more than one message marked with these flags?  I don't
> see any other code that does what you suggest with these flags.

Sorry; I was thinking of the MOREDATA flag from getmsg.  Not sure how
I misfired on this one.

>  > usr/src/cmd/cmd-inet/usr.lib/in.mpathd/mpd_tables.c
[...]
>  >   1384-1387: nit(?): it looks like you can't detach an interface with
>  >   a duplicate MAC address.  I guess that shouldn't happen, but it
>  >   seems like an odd corner case.
> 
> True.  I'm not sure if it's worth worrying about this case, but I think we
> could handle it by adding an explicit "offline reason" argument to
> phyint_offline(), and have phyint_offline() do the close even if
> pi_hwaddrdup is set as long as the offline reason is not "hwdup".

If that did happen ("local-mac-address?=false", maybe), how would the
user recover?

>  >   3367,3369,3380,3382,3396,33987: this could be simplified.  In every
>  >   case that you call ipmp_snap_add*, you always free on error.  This
>  >   means that the ipmp_snmp_add* functions could be designed to consume
>  >   the pointer unconditionally.
> 
> It would be less code, but I think the behavior would be surprising, and
> there could reasonably be cases in the future where the caller might want
> to do something else, like print a message using the contents of that
> argument prior to freeing it.

I think always consuming the argument rather than conditionally
consuming it is less surprising, but I don't feel too strongly about
it.

>  > usr/src/cmd/cmd-inet/usr.lib/in.mpathd/mpd_tables.h
[...]
>  >   167,169,221-222: nit: boolean_t used in some cases, bit flags in
>  >   others.
> 
> I'm not a huge fan of bitfields, but I'll use them when they're already
> there.

OK.

>  > usr/src/cmd/cmd-inet/usr.sbin/ifconfig/ifconfig.c
[...]
>  >   You're guaranteed that static locations of storage have distinct
>  >   addresses and can't be equal to any other pointers.
> 
> There are a number of established interfaces already assume -1 has a
> pointer representation -- e.g., MAP_FAILED with mmap(2) and SIG_ERR with
> signal(3C).  So, while a bit ugly, I think it is safe.

OK.

>  >   82: should IFF_IPMP be in this list?
> 
> No; IFF_IPMP doesn't indicate application control.

At least to a user, it almost does.  But I get the point.

>  >   4425: this should handle putting in.mpathd in a separate contract so
>  >   that it doesn't share fate with its invoker.
> 
> Getting SMF/contracts sorted for IPMP is a separate project (note that
> this code just moved from the old setifgroupname() implementation).

OK.

>  > usr/src/cmd/cmd-inet/usr.sbin/ifconfig/revarp.c
>  > 
>  >   89: could exclude other virtuals here, I guess.
> 
> Unrelated to my changes, but I believe that's already caught since vni
> will have IFF_NOARP set and IFF_LOOPBACK is checked explicitly.

OK; that makes sense.

Peter Memishian writes:
>  > usr/src/uts/common/inet/arp/arp.c
>  > 
>  >   158-162: it might be simpler to use arl->arl_ipmp_arl = arl instead
>  >   of NULL to signal "this is not in a group."  (But perhaps not worth
>  >   the effort at this point.)
> 
> I'd find that confusing; arl_ipmp_arl should always be an IPMP arl.

OK; no problem.

>  >   620: I don't think I understand this logic.  Why delete on
>  >   ace_xmit_arl match?  Shouldn't that just cause us to spin the
>  >   ar_ipmp_lookup_xmit_arl wheel instead to pick a new output?
>  > 
> 
> Because if there's an ace_xmit_arl match in the non-IPMP case, then
> ace_arl will also match (no change in behavior), and if there's an
> ace_xmit_arl match in the IPMP case, then this is an intentional purge of
> the ACE triggered by an unplumb of an IP interface in a group, which IP
> controls.  For instance, if this is an ACE_F_PERMANENT ACE, then after the
> ar_ce_delete_per_arl(), IP will ask ARP to create a new ACE for that IP
> address with a different ace_hw_addr.  There's no sense in having the old
> ACE_F_PERMANENT hanging around with a NULL ace_xmit_arl and a stale
> ace_hw_addr.

OK ... I guess what's unclear to me is when IP is in control of the
interface selection and when the ARP module does it.  It seems like
there are cases where the ARP module can do this on its own.

>  >   If ace_arl matches and ace_xmit_arl is non-NULL, then why not
>  >   promote the latter by doing this?
> 
> We don't want to kick the ACE out of the group, we want to torch the ACE,
> and let IP recreate it on another IP interface in the group if it wants.
> Also, FWIW, most ACE consumers assume ace_xmit_arl is always non-NULL.

Same issue as above.

>  >   631: this could use a different name, as it's really the handler for
>  >   an arl leaving the group.
> 
> Do you have something in mind?

Dunno ... ar_ce_group_leave, maybe?

>  >   640-645: the comment describes what the code does, but it's unclear
>  >   why it does this.  Shouldn't it be safe to leave all of these aces
>  >   alone?  They're already completely committed to the arl that's
>  >   leaving the group.  Why delete some of them?
> 
> I can't recall why it was written that way.  I've changed the code to
> leave them be; we'll see if anything happens :-)

It should be fine.  When the arl itself goes down, we'll nuke them.

>  >   654-686: if we just did 'ace->ace_arl = arl;', would that be enough?
> 
> I don't think it's appropriate to always keep these ACEs -- e.g., if it
> was an ACE for a proxy ARP entry on the IPMP interface, we don't want that
> ACE to linger behind on the removed interface.  (Note that all we're doing
> here is ar_ce_delete(); the comments make the code seem more daunting.)

What bugs me is that we're removing entries (possibly "permanent"
ones) that are on the upper-level IPMP interface, and we're removing
them only because the currently selected lower-level interface went
away.

If the entries are supposed to be there (if they are, say, proxy ARP
entries configured by an administrator), what will bring them back?

>  >   944-955: this leaves an existing timing hole open.  If we move an
>  >   address from one arl to another, and we get one of our old ARP
>  >   messages echoed back after the switch, we'll misidentify it as being
>  >   a failure.  It'd be a bit better if we could check whether the
>  >   source hardware address matches any of ours and just ignore the
>  >   message if so.
> 
> So you're suggesting:
> 
>          if (ace->ace_flags & ACE_F_MYADDR) { 
>                  arl_t *hw_arl = as->as_arl_head; 
>                  arlphy_t *ap; 
> 
>                  for (; hw_arl != NULL; hw_arl = hw_arl->arl_next) { 
>                          ap = hw_arl->arl_phy; 
>                          if (ap != NULL && ap->ap_hw_addrlen == hlen && 
>                              bcmp(ap->ap_hw_addr, src_haddr, hlen) == 0) 
>                                  return (AR_LOOPBACK); 
>                  } 
>          } 
> 
> ... right?  If not, please describe.

Something like that.  If you wanted, you could limit yourself to the
arls within the same IPMP group.  You should never see a message for
one of your own addresses reflected back on some separate group --
unless the system is badly misconfigured.

>  >   992: as above, I don't think that updating unresolved entries for
>  >   other groups makes sense.
> 
> My rationale was to preserve parity with the non-IPMP case.  That is, the
> original code updated unresolved ACEs on different ARLs in the non-IPMP
> case, and the equivalent behavior in the IPMP case is to update unresolved
> ACEs on ARLs in different groups.  Could you elaborate on how you see the
> two cases as requiring different treatment?

It did that only because it was ignorant of IPMP, and the sketchy
functionality of resolving ACEs on a separate ARL was there so that
the IPMP code would kinda work.

Now that ARP knows about IPMP, I don't think we should need to be
permissive about this: either the ACE is in the same group (and thus
any ARL in that group will do), or it's in a different group (and no
match exists).

This is a long-standing bug.  We shouldn't be receiving an ARP message
on one interface and turning around to update the ARP cache for a
completely different network!

>  >   1163-1176: the suggested changes to ar_ce_delete_per_arl would make
>  >   this extra walk unnecessary.
> 
> Perhaps, but I think you're proposing a different breakdown for the way
> ARP/IP manage ACEs (especially ACE_F_PERMANENT ones), and it's not clear
> to me that there's an issue with the breakdown I've implemented.

OK.

>  >   2224: I think we assume elsewhere that the string must be less than
>  >   LIFNAMSIZ, don't we?  I'm not sure this is needed anyway, as the
>  >   callee just uses this for strcmp, and that's bounded by the arl_t's
>  >   actual name.  We can't run off the end.
> 
> I've changed it to LIFNAMSIZ-1 to handle your arp.h comment.  I prefer
> explicitly NUL-terminating the string since ar_ipmp_lookup() shouldn't
> know about the implementation of ar_ll_lookup_by_name().

They're static functions in the same .c file that share private data
structures.  I think they're pretty well acquainted already.

I prefer it without the extra NUL setting ... but I won't press hard.

>  >   3086: why did 'arl' get removed here?  It seems like it's handy as
>  >   we dereference ace->ace_arl multiple times.
> 
> My concern was that it'd be easy to use `arl' and not realize you were
> using ace->ace_arl where you really wanted ace->ace_xmit_arl.  I could
> reintroduce it and name it ace_arl, but it was unclear to me that it
> was useful to save five characters (ace->ace_arl vs. ace_arl).

It's not a big deal, but it'd be nice to see.

>  > usr/src/cmd/cmd-inet/usr.sbin/ipmpstat/ipmpstat.c
>  >   194,1288-1304: it seems a shame that the libc version of this
>  >   function isn't good enough.
> 
> Indeed, but it isn't (need gettext()).

That's ugly.

>  >   551,733,747,761,778,793,797,802,817,821,826,840,855: should this
>  >   return "?" in buf?  (Unclear on '--' versus '?' usage.)
> 
> Good point.  Basically, we want to print "?" when the value could not be
> determined, and "--" when the value is simply not set.  Added a new
> sfunc_nvwarn() routine which properly sets `buf' to "?" on nvlist errors.

OK.

>  >   869-870: not sure what this means.  (I think it refers to sysevents
>  >   as 'GPECs' ... but I'm not sure what the comment is trying to say.)
> 
> General Purpose Event Channels -- PSARC/2002/321.  I'll add the PSARC case
> reference for clarity.

It's not just the reference, but why it matters here.

>  >   986: why do we display v4 targets if both are disabled?  The comment
>  >   above doesn't seem to explain what's going on here.
> 
> Comment has been expanded.  Basically, we want to tell the admin that
> the mode is "DISABLED" so we just pick one of the two.

OK.

>  >   1029: ofmt_head could be NULL here; does this need a die() check
>  >   like line 1066?
> 
> How could it be NULL?  It'd require the first field to have an f_width >
> IPMPSTAT_NCOL, which would mean deeper problems are afoot.

For some reason, I thought this was using winsize.ws_col.

>  >   1282: why both gettext here and a .xcl file?
> 
> Something has to translate "<field>" to its localized value.  I could do
> it as part of the gettext() on 1284, but that led to an ugly line wrap.

The part that confuses me is why you need to do gettext() at all.  I
*thought* you were using an ipmpstat.xcl file with this program so
that you didn't have to wrap all the strings with gettext().

So ... why do these strings need to be wrapped, but the others don't?

>  > usr/src/lib/libipmp/common/ipmp_admin.c
>  > 
>  >   67: nit: should probably set errno only if result.me_mpathd_error ==
>  >   IPMP_FAILURE.
> 
> When would it matter?  The value of errno should only be inspected if
> me_mpathd_error is IPMP_FAILURE.

True ... I'm just not wild about functions that modify errno when they
don't fail.  You're not supposed to do that.

>  > usr/src/lib/libipmp/common/ipmp_query.c
>  > 
>  >   172,173,199,200: is it possible for these to be non-NULL at this
>  >   point?
> 
> I think they'd still contain the pointers from in.mpathd's address space,
> since we don't bother clearing them out as part of sending them (e.g., in
> send_groupinfo()).  These are low-level internal utility routines that can
> only be used by libipmp, so I think that's OK.

OK.

>  > usr/src/lib/libipmp/common/ipmp_query.h
>  > 
>  >   136,153,154: won't it_name just duplicate what's already in if_name?
> 
> Yes, but it makes the targinfo self-describing, which simplifies the code
> in ipmpstat that uses it (otherwise, the sfunc_targ_*() functions would
> instead need to take another structure type that had both if_name and a
> targinfop so that sfunc_targ_ifname() could work).

OK.

-- 
James Carlson, Solaris Networking              <james.d.carlson at sun.com>
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

Reply via email to