On 10/07/08 21:12, Garrett D'Amore wrote:

> The interfaces are marked Committed.  I have some concerns with this, as 
> I read the project details.  One of the areas that really concerns me is 
> the fact that drivers have to deal with a situation where the total 
> number of interrupts is reduced.   (*Increasing* interrupts seems free 
> from the issues I am concerned with.)

You're right, we have done very limited experimentation with these
interfaces, and I should not be casting the callback semantics in stone.
So I will change the interfaces to Uncommitted.

> I'm interested to know how this is used with real hardware.
> 
> For example, one potential use of this facility is to support multiple 
> receive rings.  However, now the problem is that there is a window of 
> time where an extra ring may be "orphaned".  What happens to those 
> packets that are received there?  (What happens to the interrupt that 
> the hardware issues, for that matter?)

I've been working with the Neptune driver developers, to modify it to
use these interfaces.  So let's use that real hardware example.

The attach(9F) routine for Neptune sets up interrupt handling with an
algorithm like this:

        1) nintrs <-- ddi_intr_get_nintrs();
        2) nactual <-- ddi_intr_alloc(nintrs);
        3) nrequired <-- nxge_ldgv_init(nactual);
        4) for (i = [0..nrequired-1]) {
                inthandler = (lookup handler for inum i)
                ddi_intr_add_handler(i, inthandler);
                ddi_intr_enable(i);
           }

The function nxge_ldgv_init() takes a number of interrupts available
as an input, and decides: 1) how many to actually use (nrequired),
and 2) what composition of handlers to use on each vector.  (Some
are a 1-to-1 handler, others multiplex many conditions on one vector.)
I believe that function also does hardware reprogramming of some sort
to match what it decided.
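
To make that concrete, here is a minimal sketch in C of the kind of
decision such a function makes.  The name plan_vectors(), the flat list
of interrupt sources, and the policy itself (one vector per source until
the vectors run out, with the remaining sources sharing the last vector)
are all assumptions for illustration.  This is not the actual
nxge_ldgv_init() logic, and it does no hardware programming.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a function like nxge_ldgv_init().
 *
 * navail    - number of interrupt vectors the system granted
 * nsources  - number of interrupt conditions the device can raise
 * vector_of - output: vector_of[i] is the vector servicing source i
 *
 * Returns the number of vectors that will actually be used
 * (the "nrequired" value in the attach algorithm).
 */
static int
plan_vectors(int navail, int nsources, int *vector_of)
{
	int nrequired, i;

	if (navail <= 0 || nsources <= 0)
		return (0);

	assert(vector_of != NULL);

	/* Never use more vectors than there are sources to service. */
	nrequired = (navail < nsources) ? navail : nsources;

	for (i = 0; i < nsources; i++) {
		/*
		 * Sources 0 .. nrequired-1 each get their own vector
		 * (1-to-1).  Any sources beyond that are multiplexed
		 * onto the last vector in use.
		 */
		vector_of[i] = (i < nrequired) ? i : nrequired - 1;
	}

	return (nrequired);
}
```

With 6 sources and only 2 vectors available, this sketch puts source 0
on vector 0 and multiplexes sources 1 through 5 on vector 1.  With 8
vectors available it uses only 6, one per source.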

To modify this driver to use IRM, the enhancements are:

- First insert a step 0 in the attach(9F) procedure, to register
   an IRM callback handler prior to any other interrupt setup.

- Then implement a callback handler which basically tears down
   all handlers, adjusts allocations in response to the callback,
   and then restarts interrupts in the same way they were first
   set up in attach.  In pseudocode:

       nxge_cbfunc(dip, action, cbarg, arg1, arg2)
       {
          nxgep = (nxge_t *)arg1;

          switch (action) {
          case DDI_CB_INTR_ADD:
             /* cbarg carries the number of vectors being offered */
             count = (int)cbarg;
             nxge_quiesce(dip);
             /* tear down every handler currently installed */
             for (i = 0; i < nxgep->nactual; i++) {
                ddi_intr_disable(nxgep->hdls[i]);
                ddi_intr_remove_handler(nxgep->hdls[i]);
             }
             /* append the new handles after the existing ones */
             ddi_intr_alloc(dip, &nxgep->hdls[nxgep->nactual], count, &nactual);
             nxgep->nactual += nactual;
             /**** Repeat steps 3 & 4 from attach routine ****/
             nxge_unquiesce(dip);
             return (SUCCESS);

          case DDI_CB_INTR_REMOVE:
             /* cbarg carries the number of vectors being taken away */
             count = (int)cbarg;
             nxge_quiesce(dip);
             /* tear down every handler currently installed */
             for (i = 0; i < nxgep->nactual; i++) {
                ddi_intr_disable(nxgep->hdls[i]);
                ddi_intr_remove_handler(nxgep->hdls[i]);
             }
             /* free the last 'count' handles */
             for (i = 0; i < count; i++) {
                ddi_intr_free(nxgep->hdls[nxgep->nactual - 1]);
                nxgep->nactual--;
             }
             /**** Repeat steps 3 & 4 from attach routine ****/
             nxge_unquiesce(dip);
             return (SUCCESS);
          }
       }

In general, drivers should already have an ability to quiesce their
hardware.  They should have this for other reasons, like OS quiesce
during certain DR operations.

And, drivers that can utilize more interrupt vectors must already
have logic to adapt themselves to arbitrary numbers of vectors,
as is done by the nxge_ldgv_init() function in Neptune.

Put those two bits of logic together, and the driver should be able to
quiesce, re-calculate, and unquiesce as necessary during a callback.
Things should not get orphaned.  And there is no need for a driver
to try and create a policy to lock in a specific number of interrupt
vectors.  To get and use more interrupt vectors, drivers must have the
flexibility to dynamically adapt in either direction: either to
scale up and use more vectors, or to scale down and use fewer.
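
As a rough illustration of why nothing gets orphaned, here is a small
user-space model in C of the quiesce, re-calculate, unquiesce sequence.
Everything in it is hypothetical: model_t, model_replan(),
model_resize() and model_no_orphans() are names made up for this sketch,
the "last vector is shared" policy is an assumption, and no real DDI
calls are made.  It only shows the invariant that after each resize,
every interrupt source is mapped to a vector that still exists.

```c
#include <assert.h>
#include <stdbool.h>

#define	MAX_SOURCES	8	/* arbitrary limit for this sketch */

/* Hypothetical user-space model of a driver's interrupt state. */
typedef struct model {
	int	nsources;		/* interrupt conditions to service */
	int	nactual;		/* vectors currently allocated */
	int	nrequired;		/* vectors currently in use */
	int	vector_of[MAX_SOURCES];	/* vector servicing each source */
	bool	quiesced;		/* true while the device is paused */
} model_t;

/* Redistribute all sources over the vectors currently allocated. */
static void
model_replan(model_t *m)
{
	int i;

	assert(m->nactual >= 1 && m->nsources <= MAX_SOURCES);

	m->nrequired = (m->nactual < m->nsources) ? m->nactual : m->nsources;
	for (i = 0; i < m->nsources; i++) {
		/* 1-to-1 while vectors last; the rest share the last one */
		m->vector_of[i] = (i < m->nrequired) ? i : m->nrequired - 1;
	}
}

/*
 * Model of the callback: delta > 0 adds vectors, delta < 0 removes them.
 * Returns 0 on success, or -1 if the request would leave the device
 * with no vectors at all (state is left unchanged in that case).
 */
static int
model_resize(model_t *m, int delta)
{
	if (m->nactual + delta < 1)
		return (-1);

	m->quiesced = true;	/* stop traffic before touching handlers */
	m->nactual += delta;	/* stands in for allocating/freeing handles */
	model_replan(m);	/* stands in for steps 3 & 4 of attach */
	m->quiesced = false;	/* resume with the new layout */
	return (0);
}

/* True if every source is serviced by a vector that still exists. */
static bool
model_no_orphans(const model_t *m)
{
	int i;

	for (i = 0; i < m->nsources; i++) {
		if (m->vector_of[i] < 0 || m->vector_of[i] >= m->nactual)
			return (false);
	}
	return (true);
}
```

Because the mapping is recomputed from scratch while the device is
paused, shrinking from 4 vectors to 1 simply moves every source onto
vector 0, and growing again spreads them back out.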

> How will this feature interact with Crossbow resource allocation?  Do we 
> have any examples of the two working together yet?

Our team has met with the Crossbow team.  Their basic concern was that
we must not regress the performance of certain NIC drivers.  The drivers
they rely upon have existing workaround mechanisms to get more interrupt
vectors.  My project preserves those workarounds, so they can still get
more interrupts before they are converted to use my new interfaces.

-- 
Scott
