Re: [networking-discuss] code review request for public DLPI library

James Carlson Mon, 09 Oct 2006 06:48:42 -0700

Sagun Shakya writes:
> >> http://zhadum.east.sun.com/export/build1/ws/sshakya/clearview-libdlpi-onnv/webrev/
> >>     
> > ifconfig/ifconfig.c
[...]
> >   55: I'm curious: why was 'ifname' changed to 'linkname'?  It seems
> >   like it's the same thing, and the latter requires a lot of
> >   rototilling.
> >   
> This was changed to represent the correct name as a datalink name.


I know ... but there doesn't seem to have been anything especially
wrong with the previous variable name, and changing it seems to cause
a lot of change in the code.  It'll be annoying for maintainers.

I suppose I don't care that much, though.

> ifconfig/revarp.c
[...]
> >   169: I'd recommend using something that isn't tied to system time of
> >   day, like gethrtime().  (And you can then do work directly in
> >   milliseconds rather than multiplying seconds by 1000 for the call to
> >   dlpi_recv.)
> >   
> Since gethrtime() returns time in nanosecondes wouldn't this require 
> further conversion from nanoseconds to milliseconds for dlpi_recv()

Yes.

> and 
> if the rarp_timeout is to be calculated in seconds than a conversion 
> from nanoseconds to seconds?

I'm not talking specifically here about the number of conversions (I
think it's pretty much a wash).  There are two problems here:

  - time drift
  - precision

If you keep time in seconds, then you have to multiply by 1000 to get
milliseconds, and that's much less precise than keeping time in
nanoseconds and dividing down.

The drift issue is _much_ more serious.  The normal time of day clock
on the system is adjusted by the time-setting commands.  This means
that successive calls to gettimeofday() can return values that are
well above _or_ below the previous call.  It's not reliable for this
sort of usage.

It's reliable if you want to know the current time of day so that you
can print this value out somewhere, but it's not useful if you're
trying to measure intervals.

> >   170: why did waittime change to waitsec?
> >   
> This was changed to reflect variable waittime was in seconds.

It seems to me like an unclear change.

> > snoop/snoop_capture.c
[...]
> Instead of the comment I went with the approach of copying the link name 
> before  nuking the handle. I don't think passing the linkname in 
> pr_errdlpi() in itself is any problem.

OK.

> >   311: this test isn't needed.  The only way to get here when quitting
> >   is false is if retval is not success.
> >   
> Couldn't we get here if it breaks out of the loop in line 299 due to 
> retval is not success but quitting is false? Or I'm not being able to 
> follow the logic ?

Ah, I think the real issue here is that if you stop on error *and* the
'quitting' flag is set, then you don't want to print out the error.

So, the easiest (and clearest) thing to do is to combine these two
conditions with &&.

> > dladm/dladm.c
> >
> >   1592-1593: it looks to me like the original code checked for "aggr"
> >   first on purpose.  Maybe.  :-/
> >   
> This change should done a strcmp on 'drv' instead of 'dev' which would 
> need the dlpi_parselink() called first.
> 
> 1592                 if (dlpi_parselink(dev, drv, &index) != DLPI_SUCCESS || 
> 1593                     strcmp(drv, "aggr") == 0 || index >= 1000 || 

OK.

> > iscsitgtd/util_ifname.c
[...]
> >   347: this looks broken to me.  The original code did a malloc for
> >   *addr.  That's why the argument was a uchar_t **, rather than just a
> >   uchar_t *.  This new code doesn't appear capable of doing any memory
> >   allocation.  Did you test iSCSI?
> >
> >   
> The iSCSI testsuite was ran against  bfu bits. I will see why this was 
> not caught.

OK.

> > libdlpi/Makefile
> >
> >   34: just curious: what 64 bit user space objects link against this
> >   library?
> >
> There are no known ones.

OK, then, no issue.

> > common/libdlpi.c
> >   218: no guarantee that bufc (which becomes ctlbuf) is properly
> >   aligned; this may drop core on platforms (such as SPARC) that
> >   require alignment.
> >   
> Using the -misalign flag at compile time would not be a solution to 
> this. How should this be solved?

long bufc[DLMAXBUF / sizeof (long)];
long bufd[DLMAXBUF / sizeof (long)];

> >   238,525: I'd recommend against gettimeofday.  It drifts and isn't
> >   reliable enough for an application like this.  gethrtime() is
> >   better.
> >   
> I'll use gethrtime and here a conversion from nanoseconde to 
> milliseconds will be needed?

Yes.

> This test is for the case when raw message are requested. line 933.  It 
> was not intentional to loop forever when both ctlbuf and databuf are null.
> This will be fixed.
> >   372: what does the stat() call here do that the open() does not?
> >   
> I had left this as from the previous (consolidation private) version of 
> libdlpi.

Should it be removed?

> >   394: as a design rule, having the user allocate the data structure
> >   to be used within a library is dangerous.  The dlpi_info_t structure
> >   should be allocated by a library function, or this interface should
> >   require the user to pass in sizeof(dlpi_info_t) in case things
> >   change in the future.
> >   
> Curious isn't the opt field in some sense to serve the purpose of the 
> sizeof(dlpi_info_t) since if
> the sturcture size change can be detected from the different opt field?

So, the plan is to force users to pass something other than zero as
the third argument if the dlpi_info_t structure ever changes?

I guess that'll work, but it seems harder to use than necessary.  I
hope the structure just never needs to change.

> >   411: why is this needed?  dl_info_req_t has a fixed known size.
> >   
> This is to keep the dlpi_primsizes table consistent with all the 
> primitives used to have the corresponding size.

It seems a bit more complex than needed, but ok.

> >   431: should do more validation here.  dl_qos_offset had better not
> >   be less than sizeof (*infoackp).  dl_qos_length should not be
> >   greater than the size of the buffer returned.  Incidentally, given
> >   the way dlpi_msg_common is defined, how do you get at the returned
> >   message length for checking?]
> >   
> I have added the validation. If I understood your question properly do 
> you mean how do I check the size of the returned reply primitive. In 
> that case expecting does the check.

No, that doesn't work.

What happens if dl_qos_offset+dl_qos_length is greater than the actual
size of the returned message?  What can check that?

> >   432: what guarantees that &infop->di_qos_sel is big enough to hold
> >   all infoackp->dl_qos_length bytes?
> >   
> Is it safe after adding a check as below line 432 and dlpi_info_t 
> structure defines di_qos_sel as a dl_qos_cl_sel1_t?

I'm not sure what you mean.  I think you should be checking that
dl_qos_length isn't too large for the output buffer.

> >   438-439,450-451,472-473: needs more checking.
> >   
> Adding a check for offset as well
>   infoackp->dl_addr_offset  > sizeof(dl_info_ack_t)

It should be >=, not just >.

> Added check for :
> 432: (infoackp->dl_qos_length < sizeof(dl_qos_cl_sel1_t)))

I don't understand that.  "<" doesn't seem to make sense here.

> 438-439: (infoackp->dl_qos_range_length < sizeof(dl_qos_cl_range1_t))) }

Same.  Why wouldn't that be "<="?

> 450-451: (infoackp->dl_addr_length < DLPI_PHYSADDR_MAX)

"<="?

> 472-473: see comment for line 464

I'm confused by that.  I'm asking what _checks_ the value.

> >   440: what guarantees that &infop->di_qos_range is big enough to hold
> >   all infoackp->dl_qos_range_length bytes?
> >   
> Same question as for comment of line 432.

OK ... ?

> >   457,460: why is this done only in the case where there's an address?
> >   Why not set this flag when you're computing the abs value?
> >   
> Flag is set when computing the abs value

(Code changed?)

> >   464,466,475: same questions about destination buffer size: how do we
> >   know that the buffer can hold what we're copying?  What if the
> >   message from the driver is corrupt?
> >   
> For the destination buffer since infop->di_physaddr is an array of 
> DLPI_PHYSADDR_MAX isn't it safe to assume the size?

No.

>   I've added a check (takes care of the later comment as well):
> 
>        if (infop->di_physaddrlen > DLPI_PHYSADDR_MAX ||
>             infop->di_bcastaddrlen != infop->di_physaddrlen
> Do you suggest a check of offset  and a check of dl_sap_length?

Yes.

> >   470: if you later assume this to be so, then why not check it here?
> >   If dl_brdcst_addr_length is non-zero, then check that it's also
> >   equal to infop->di_physaddrlen.  If it's not, then something's
> >   wrong.
> >
> >   
> Would you suggest if it's not equal to return  DL_BADADDR and same for 
> above?

Yes.

> >   515-518: why is this needed?  Why not just use "dl_attach_req_t
> >   attachreq;" and "dl_ok_ack_t ack;" on the stack, rather than calling
> >   these extra functions and using alloca?  There doesn't seem to be a
> >   benefit in this case.
> >   
> Not needed except to keep it consistent as the others. But in that case 
> dlpi_passive(), dlpi_unbind() could be done the same.

I see.

> >   541: if 'prim' is invalid, this accesses memory off the end of the
> >   array.  (Not that I care for a static function always called with
> >   constants, but ...)
> >   
> I've changed this to:
> return (i < nprimsizes ? dlpi_primsizes[i].dp_primsz : 0);

For what it's worth, I would have used sizeof (t_uscalar_t) instead of
0 there.  The caller is probably allocating memory as a result of this
call, and the minimum primitive length is actually the size of the
dl_primitive field.

> >   579: will access one off the beginning of the array if "" is passed
> >   in for provider.
> >   
> Added a check of:
>  if (provlen = 0 || provlen >= DLPI_LINKNAME_MAX)
>                 return (DLPI_ELINKNAMEINVAL);

"=="?

> >   668: should this function do something about dip->dli_sap?
> >   
> Probably but I'm not sure what can be done since setting it to zero also 
> has a meaning.

I'd prefer seeing it returned to its initial value, whatever that
might be.  (If that's zero, then that's fine, even if zero has a
meaning.  Perhaps that might be a reason to revisit the chosen initial
value ...)

> >   784-785: should do more checking here.
> >   
> Added a check for dl_addr_length:
>         if ((physackp->dl_addr_offset != 0) &&
>             (physackp->dl_addr_length != 0) &&
>             (physackp->dl_addr_length <= DLPI_PHYSADDR_MAX)) {
> 
>    adding a check for offset as well.
> physackp->dl_addr_offset > DL_PHYS_ADDR_ACK_SIZE;

That should be ">=".

> >   953-954: should check more.
> >   
> Added a check for:
> (udatap->dl_src_addr_length <= DLPI_PHYSADDR_MAX))
> Do you suggest to add a check for the offset as well?
> udatap->dl_src_addr_offset >= DL_UNITDATA_REQ_SIZE;

Yes.  But I think that should be DL_UNITDATA_IND_SIZE for input.

> >   958,962: how do we know saddrp is big enough?
> >
> >   975,979: same for recvp->dri_destaddr.
> >   
> The approach was, as will be in the manpage for dlpi_recv(), the caller 
> needs to ensure saddrp is atleast DLPI_PHYSADDR_MAX. 
> As in other places will add a check for saddrp.
> For recvp->dri_destaddr, the structure member is defined to be a an 
> array of DLPI_PHYSADDR_MAX. Will that ensure the size otherwise will add
>   a check of sizeof(recvp->dri_destaddr) >= DLPI_PHYSADDR_MAX.

For saddrp, I guess that's sufficient.  It means, though, that if we
ever have to change DLPI_PHYSADDR_MAX because of some new datalink
layer (IB++?), we'll be in a tough place.

For dri_destaddr, the problem is that dri_destaddrlen is just
dl_dest_addr_length minus dli_saplen.  There seems to be no guarantee
that this is actually in the range 0..DLPI_PHYSADDR_MAX.

> >   983: I'd suggest passing the dl_group_address value back to the
> >   caller unmolested.  Values other than 0 and 1 may be significant.
> >   
> Isn't the purpose of dri_isunicast just to indicate if it  is a unicast 
> address or  multicast/broadcast address?

Yes, but the actual value in the structure is t_uscalar_t, and the
comment says "one if multicast/broadcast."  It doesn't say it's a
boolean, or that other values will never be needed in the future.

My suggestion is to rename this field to dri_group_address, and just
copy the value in.

> >   1033-1038: seems a bit misleading; consider reformatting.
> >   
> I 've reformatted this to:
> 
>          if (retval != DLPI_SUCCESS) { 
>                  dip->dli_mod_pushed = 0;
>                  return (retval);
>        } else {
>                dip->dli_fd = fd;
>        }

No, I was suggesting this:

        if (retval != DLPI_SUCCESS) {
                dip->dli_mod_pushed = 0;
                return (retval);
        }

        dip->dli_fd = fd;

I find the 'else' clause to be very misleading in this case.  'Else'
makes sense when both arms have return or goto statements, or when
neither one does.  If only one arm has return/goto, then it's much
less clear to the reader when the 'else' is actually encountered, and
what paths lead to falling through to the code below.

> >   1087: what?
> >   
> As mentioned above in comment for line 1026, this code was left as the 
> previous code, except the variable name changes, since it will be 
> removed by the IP Tunnel work.
> The code here for example for ifconfig ip.tun0  plumb does is
> 
> line 1047:dli_mod_pushed++ is done so that the pushing of the last 
> module is done in i_dlpi_style2_open.
> 
> Line 1087 is to bring the dli_mod_pushed to the actual value (bumped in 
> line 1047) so the ioctl call in line 1094 is done.
> 
> I can add a comment and put a note that this code is temporary and will 
> be simplified with the IP tunnel work.

OK.

> >   1254: the design seems somewhat fragile here.  It'll discard
> >   unexpected asynchronous messages rather than dispatching them, which
> >   may well be harmful for things like DL_NOTIFY_IND.  (No, I don't
> >   have a good solution that doesn't involve significant complexity for
> >   the application.  :-<)
> Sorry I don't have a better solution. If DL_NOTIFY_IND is to be added, 
> the notification will have to probably be more complicated. Like the api 
> dlpi_notify() could be added a call back function and callback argument 
> mechanism as proposed in the PSARC fasttrack?

Yes, that's where I think it would be headed.

At least a comment here explaining the risk would be good.  The man
page should also explain what happens.

> >   1291-1294: what does this represent?  Can it ever happen?
> >   
> Probably not, it is a default condition if anything unexpected happens.

I don't understand.  The expecting() function will _prevent_ this from
happening.  That makes it an assertion, not an "unexpected" case.

If it doesn't work, then it's a design problem in this library.

> >   1359: this string isn't reachable; could just be NULL.
> >   
> Would this be just a matter of preference if left with as the error text?

It's clearer to me if it's left as NULL -- since readers will be left
wondering how and why that string is ever used -- and it seems unfair
to force translators to translate a string we never use, but if you
really need it, I suppose it can be kept.

> >   1407: why is Style 1 an error?  (Maybe error text is confusing.)
> >   
> This is an error returned when a STYLE2 is expected. Would 
> DLPI_ENOTSTYLE2 with error test of "Not Style 2 DLPI device" easy to 
> understand?

I'd expect "Style 2 Node Name Reports Style 1" or some such.

> > common/libdlpi.h
> >   
> >   77: why not just have DLPI_SUCCESS = 10000 and remove DLPI_ERRBASE?
> >   
> In dlpi_strerror() wanted to do the calculation 'err - DLPI_ERRBASE' 
> instead of DLPI_SUCCESS.  I've changed ithave it to be:
> 
>   enum {
>             DLPI_ERRBASE = 10000,           /* starting libdlpi error code */
>             DLPI_SUCCESS = DLPI_ERRBASE,    /* libdlpi success */

Note that if you do this, you'll need to remove that first
libdlpi_errlist[] entry.

I still don't quite see the point of DLPI_ERRBASE, though.
DLPI_SUCCESS looks like the start of the list to me.

> > common/libdlpi_impl.h
> >
> >   74: why not milliseconds (which is what poll(2) uses)?
> >   
> Except for dlpi_recv(), the timeout values for other APIs  is in 
> seconds  therefore I chose to  set it to seconds.

OK.

> >   79: is this actually t_uscalar_t?
> >   
> Are you asking because dl_provider_style is t_uscalar_t?

Yes.

> >   81: this is used elsewhere as uint_t; inconsistent.
> >   
> I will fix this to be a uint16_t elsewhere. The reason for that is 
> uint16_t is what the drivers use  for the sap value. 

No, it's what Ethernet drivers happen to use; it's not a DLPI
constraint.  There's no guarantee that the SAP length is exactly 2.
In fact, for PPP on Solaris, it's 4, so I think this library will fail
if anyone tries to talk to /dev/sppp.  (It probably should not be 4,
but that's another story.)

If I were writing the library, I would support SAP lengths of 0, 1, 2,
and 4.

-- 
James Carlson, KISS Network                    <[EMAIL PROTECTED]>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677
_______________________________________________
networking-discuss mailing list
[email protected]

Re: [networking-discuss] code review request for public DLPI library

Reply via email to