Re: Issue 0 WAS (Re: Oustanding issues WAS(IRe: Consensus? WAS(RFC 2863)

Krzysztof Halasa Tue, 15 Nov 2005 08:45:42 -0800

Stefan Rompf <[EMAIL PROTECTED]> writes:

> as described in net/core/dev.c:
>
>  * The @dev_base list is protected by @dev_base_lock and the rtln
>  * semaphore.
>  *
>  * Pure readers hold dev_base_lock for reading.
>  *
>  * Writers must hold the rtnl semaphore while they loop through the
>  * dev_base list, and hold dev_base_lock for writing when they do the
>  * actual updates.  This allows pure readers to access the list even
>  * while a writer is preparing to update it.


You are _not_ protected with dev_base_lock, only the line:
                dev->operstate_useroverride = OPERU_DORMANT;
is protected. Why don't you do atomic write instead?

dev->operstate_kernel can change (using netif_set_kernel_operstate())
at any moment, including from within IRQ handler. I assume you already
do atomic access. Still, this is racy as a whole. To see what I mean,
could you please outline (fill the empty functions) how would you use
netif_[gs]et_kernel_operstate():

// MODULE 1
// hardware driver changes carrier (called from IRQ handler)
hardware_driver_carrier_changed(int carrier_state)
{
}

// MODULE 2
// protocol driver changes protocol status based on negotations with peer
proto_driver_protocol_state_changed(int protocol_state)
{
}

// protocol driver listens to device status changes:
proto_driver_device_event(int state)
{
}

> rtnl is held by caller during linkwatch_run_queue(),
> operstate_useroverride is 
> only updated in process context, so this works.

What does calling from process context change here? That would be true
with BKL and, say, Linux 2.0 but it's no longer the case. rtnl lock
protects you from userland, though.

> This doesn't guarantee yet that no other netif_set_kernel_operstate() has 
> taken place before linkwatch_work runs - therefore, the change would need to 
> be written in an event queue what I don't do [yet]. But it guarantees that 
> there will be another event if that happens.

Event - yes. But the change will have already been lost.

Look, you can even have this:

CPU1:                                   CPU2:
                                netif_set_kernel_operstate(newstate) {
                                    oldstate = dev->operstate_kernel;
netif_set_kernel_operstate(newstate) {
    oldstate = dev->operstate_kernel;
        if (oldstate != newstate) {
[1]         dev->operstate_kernel = newstate;
                                        if (oldstate != newstate) {
[2]                                          dev->operstate_kernel = newstate;

            smp_wmb();
            linkwatch_fire_event(dev);
                                            smp_wmb();
                                                linkwatch_fire_event(dev);

How do you protect [1] and [2] from overwriting each other? You have to
use a lock, and this lock have to protect not only
netif_set_kernel_operstate() but the whole logic above it.

I don't say it's unacceptable, though I'd ask Jeff or David first (still
hope they'll say something co I'm ccing them).

You don't currently need such lock only because there is just one
module allowed to change a particular flag (__LINK_STATE_NOCARRIER etc.),
and this module is required to serialize all changes (which is usually
done naturally from device IRQ handler which is already serialized).
-- 
Krzysztof Halasa
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Issue 0 WAS (Re: Oustanding issues WAS(IRe: Consensus? WAS(RFC 2863)

Reply via email to