Re: Linux 2.2.19pre2

2000-12-24 Thread Andrea Arcangeli

On Sun, Dec 24, 2000 at 11:23:33AM +1100, Andrew Morton wrote:
> ack.

This patch against 2.2.19pre3 should fix all the races. (Note that wait->flags
doesn't need to be initialized in the critical section in test1X either.)


ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.19pre3/wake-one-3

Comments?

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Linux 2.2.19pre2

2000-12-24 Thread Andrea Arcangeli

On Sun, Dec 24, 2000 at 04:17:10PM +1100, Andrew Morton wrote:
> I was talking about a different scenario:
> 
> add_wait_queue_exclusive(&q->wait_for_request, &wait);
> for (;;) {
>         __set_current_state(TASK_UNINTERRUPTIBLE);
>         /* WINDOW */
>         spin_lock_irq(&io_request_lock);
>         rq = get_request(q, rw);
>         spin_unlock_irq(&io_request_lock);
>         if (rq)
>                 break;
>         generic_unplug_device(q);
>         schedule();
> }
> remove_wait_queue(&q->wait_for_request, &wait);
> 
> Suppose there are two tasks sleeping in the schedule().
> 
> A wakeup comes.  One task wakes.  It loops around and reaches
> the window.  At this point in time, another wakeup gets sent
> to the waitqueue. It gets directed to the task which just
> woke up![..]

Ok, this is a very minor window compared to the current one, but yes, that
could happen too in test4.

> I assume this is because this waitqueue gets lots of wakeups sent to it.

It only gets the strictly necessary number of wakeups.

> Linus suggested at one point that we clear the waitqueue's
> WQ_FLAG_EXCLUSIVE bit when we wake it up, [..]

.. and then set it after checking if a new request is available, just
before schedule(). That would avoid the above race (and the one
I mentioned in the previous email), but it doesn't address the lost wakeups,
for example when setting USE_RW_WAIT_QUEUE_SPINLOCK to 1.

Counting as wakeups only the ones that move the task to the runqueue will get
rid of the races altogether, and it looks conceptually right, so I prefer it.

Andrea



Re: Linux 2.2.19pre2

2000-12-23 Thread Andrew Morton

Andrea Arcangeli wrote:
> 
> On Sun, Dec 24, 2000 at 01:28:59PM +1100, Andrew Morton wrote:
> > This could happen with the old scheme where exclusiveness
> > was stored in the task, not the waitqueue.
> >
> > From test4:
> >
> > for (;;) {
> >         __set_current_state(TASK_UNINTERRUPTIBLE | TASK_EXCLUSIVE);
> >         /* WINDOW */
> >         spin_lock_irq(&io_request_lock);
> >         rq = get_request(q, rw);
> >         spin_unlock_irq(&io_request_lock);
> >         if (rq)
> >                 break;
> >         generic_unplug_device(q);
> >         schedule();
> > }
> >
> > As soon as we set TASK_UNINTERRUPTIBLE|TASK_EXCLUSIVE, this
> > task becomes visible to __wake_up_common and can soak
> > up a second wakeup. [..]
> 
> I disagree: no wakeup could be lost in test4 in the above section of code
> (besides the __set_current_state that should be set_current_state ;) and that's
> not an issue on x86 or alpha, where spin_lock is a full memory barrier).
> 
> Let's examine what happens in test4:
> 
> [ snip ]

I was talking about a different scenario:

add_wait_queue_exclusive(&q->wait_for_request, &wait);
for (;;) {
        __set_current_state(TASK_UNINTERRUPTIBLE);
        /* WINDOW */
        spin_lock_irq(&io_request_lock);
        rq = get_request(q, rw);
        spin_unlock_irq(&io_request_lock);
        if (rq)
                break;
        generic_unplug_device(q);
        schedule();
}
remove_wait_queue(&q->wait_for_request, &wait);

Suppose there are two tasks sleeping in the schedule().

A wakeup comes.  One task wakes.  It loops around and reaches
the window.  At this point in time, another wakeup gets sent
to the waitqueue. It gets directed to the task which just
woke up!  It should have been directed to a sleeping task,
not the current one which just *looks* like it's sleeping,
because it's in state TASK_UNINTERRUPTIBLE.

This can happen in test4.

> This race has been introduced in test1X because there the task keeps asking for
> an exclusive wakeup even after getting a wakeup: until it has time to
> clean up and unregister explicitly.

No, I think it's the same in test4 and test1X.  In current kernels
it's still the case that the woken task gets atomically switched into
state TASK_RUNNING, and then becomes "hidden" from the wakeup code.

The problem is, the task bogusly sets itself back into
TASK_UNINTERRUPTIBLE for a very short period and becomes
eligible for another wakeup.

There's not much which ll_rw_blk.c can do about this.

> And btw, 2.2.19pre3 has the same race, while the alternate wake-one
> implementation in 2.2.18aa2 doesn't have it (for the same reasons).

The above scenario can happen in all kernels.

> And /* WINDOW */ is not the only window for such a race: the window is the whole
> section where the task is registered in the waitqueue in exclusive mode:
> 
> add_wait_queue_exclusive
> /* all is WINDOW here */
> remove_wait_queue
> 
> Until remove_wait_queue runs explicitly in the task context, any second wakeup
> will keep waking up only that task (so in turn it will be lost). So it should
> trigger very easily under high load, since two wakeups will happen much faster
> than the task can be rescheduled in order to run
> remove_wait_queue and clean up.
> 
> Any deadlocks in test1X could be very well explained by this race condition.

Yes, but I haven't seen a "stuck in D state" report for a while.
I assume this is because this waitqueue gets lots of wakeups sent to it.

The ones in networking I expect are protected by other locking
which serialises the wakeups.

The one in the x86 semaphores avoids it by sending an extra wakeup
to the waitqueue on the way out.

> > Still, changing the wakeup code in the way we've discussed
> > seems the way to go. [..]
> 
> I agree. I'm quite convinced it's the right way too, and of course I see why we
> moved the exclusive information into the waitqueue instead of keeping it in the
> task struct: to provide sane semantics in the long run ;).

Linus suggested at one point that we clear the waitqueue's
WQ_FLAG_EXCLUSIVE bit when we wake it up, so it falls back
to non-exclusive mode.  This was to address one of the
task-on-two-waitqueues problems...




Re: Linux 2.2.19pre2

2000-12-23 Thread Andrea Arcangeli

On Sun, Dec 24, 2000 at 01:28:59PM +1100, Andrew Morton wrote:
> This could happen with the old scheme where exclusiveness
> was stored in the task, not the waitqueue.
> 
> From test4:
> 
> for (;;) {
>         __set_current_state(TASK_UNINTERRUPTIBLE | TASK_EXCLUSIVE);
>         /* WINDOW */
>         spin_lock_irq(&io_request_lock);
>         rq = get_request(q, rw);
>         spin_unlock_irq(&io_request_lock);
>         if (rq)
>                 break;
>         generic_unplug_device(q);
>         schedule();
> }
> 
> As soon as we set TASK_UNINTERRUPTIBLE|TASK_EXCLUSIVE, this
> task becomes visible to __wake_up_common and can soak
> up a second wakeup. [..]

I disagree: no wakeup could be lost in test4 in the above section of code
(besides the __set_current_state that should be set_current_state ;) and that's
not an issue on x86 or alpha, where spin_lock is a full memory barrier).

Let's examine what happens in test4:

1) before the `for' loop, the task is just visible to __wake_up_common

2) as soon as we set the task state to TASK_UNINTERRUPTIBLE|TASK_EXCLUSIVE
   the task _starts_ asking for an exclusive wakeup from __wake_up_common

At this point consider two different cases:

3a) assume we get a wakeup in the /* WINDOW */: task->state becomes
runnable from under us (no problem, that's expected). And if we really
got a wakeup in /* WINDOW */, then get_request will also return
a valid rq, we'll break the loop and use the resource. No wakeup is
lost in this case. So far so good. This was the 1-wakeup case.

3b) Assume _two_ wakeups happen in /* WINDOW */: the first one just makes
us RUNNABLE as in the 3a) case, the second one will wake up any _other_
wake-one task _after_ us in the wait queue (normal FIFO order), because the
first wakeup implicitly made us RUNNABLE (no longer TASK_EXCLUSIVE, so we
were no longer asking for an exclusive wakeup).

The task after us may well be running in /* WINDOW */ too, and in that case
it will likewise get an `rq' from get_request (as in case 3a, without having
to sleep). All right.

No wakeup could be lost in any case. Two wakeups happened and two
tasks got their I/O requests, even if both wakeups happened in /* WINDOW */.
The example with N wakeups and N tasks is equivalent.

This race has been introduced in test1X because there the task keeps asking for
an exclusive wakeup even after getting a wakeup: until it has time to
clean up and unregister explicitly.

And btw, 2.2.19pre3 has the same race, while the alternate wake-one
implementation in 2.2.18aa2 doesn't have it (for the same reasons).

And /* WINDOW */ is not the only window for such a race: the window is the whole
section where the task is registered in the waitqueue in exclusive mode:

add_wait_queue_exclusive
/* all is WINDOW here */
remove_wait_queue

Until remove_wait_queue runs explicitly in the task context, any second wakeup
will keep waking up only that task (so in turn it will be lost). So it should
trigger very easily under high load, since two wakeups will happen much faster
than the task can be rescheduled in order to run
remove_wait_queue and clean up.

Any deadlocks in test1X could be very well explained by this race condition.

> Still, changing the wakeup code in the way we've discussed
> seems the way to go. [..]

I agree. I'm quite convinced it's the right way too, and of course I see why we
moved the exclusive information into the waitqueue instead of keeping it in the
task struct: to provide sane semantics in the long run ;).

Andrea



Re: Linux 2.2.19pre2

2000-12-23 Thread Andrew Morton

Andrea Arcangeli wrote:
> 
> ...
> >                 if (rq)
> >                         break;
> >                 generic_unplug_device(q);
> >                 schedule();
> >         }
> >         remove_wait_queue(&q->wait_for_request, &wait);
> >         current->state = TASK_RUNNING;
> >         return rq;
> > }
> >
> > If this task enters the schedule() and is then woken, and another
> > wakeup is sent to the waitqueue while this task is executing in
> > the marked window, __wake_up_common() will try to wake this
> > task a second time and will then stop looking for tasks to wake.
> >
> > The outcome: two wakeups sent to the queue, but only one task woken.
> 
> Correct.
> 
> And btw such a race is new; it must have been introduced in late 2.4.0-test1X or
> so. I'm sure it couldn't happen in the whole of 2.3.x and 2.4.0-testX because the
> wakeup was atomically clearing the exclusive bit from task->state.
> 

This could happen with the old scheme where exclusiveness
was stored in the task, not the waitqueue.

From test4:

for (;;) {
        __set_current_state(TASK_UNINTERRUPTIBLE | TASK_EXCLUSIVE);
        /* WINDOW */
        spin_lock_irq(&io_request_lock);
        rq = get_request(q, rw);
        spin_unlock_irq(&io_request_lock);
        if (rq)
                break;
        generic_unplug_device(q);
        schedule();
}

As soon as we set TASK_UNINTERRUPTIBLE|TASK_EXCLUSIVE, this
task becomes visible to __wake_up_common and can soak
up a second wakeup.  I assume this hasn't been a reported problem
because request queues get lots of wakeups sent to them?

Still, changing the wakeup code in the way we've discussed
seems the way to go.  It also makes the extra wake_up()
at the end of x86's __down() unnecessary, which is a small
performance win - semaphores are currently wake-two.

But Linus had a different reason why that wakeup was there.
Need to dig out the email and stare at it.  But I don't see a
good reason to muck with the semaphores at this time.




Re: Linux 2.2.19pre2

2000-12-23 Thread Andrea Arcangeli

On Sun, Dec 24, 2000 at 11:23:33AM +1100, Andrew Morton wrote:
> Andrea Arcangeli wrote:
> > 1) could be fixed trivially by making the waitqueue_lock a spinlock, but
> > this way doesn't solve 2). And if we solve 2) properly then 1) gets fixed as

BTW (following up to myself), really making the lock a spinlock (not a read-write
lock) would fix 2) as well (waitqueue_lock is global in 2.2.x; I was thinking of
the per-waitqueue lock of 2.4.x ;).

> > well.
> 
> I don't understand the problem with 2) in 2.2?  Every task on both waitqueues
> gets woken up.  Won't it sort itself out OK?

Not every task, if it's a wake-one on both waitqueues. The problem should be the
same in 2.2.x and 2.4.x. But whether such usage makes sense is uncertain...

> For 2.4, 2) is an issue because we can have tasks on two waitqueues at the
> same time, with a mix of exclusive and non.  Putting a global spinlock
> into __wake_up_common would fix it, but was described as "horrid" by
> you-know-who :)

Yes. And that wouldn't fix the race number 3) below.

> > I agree the right fix for 2) (and in turn for 1) ) is to count only the
> > exclusive wake_up_process calls that move the task into the runqueue; if the
> > task was already in the runqueue we must not consider it an exclusive wakeup
> > (so in turn we'll try again to wake up the next exclusive-wakeup waiter).
> > This will fix both races. Since the fix is self-contained in __wake_up it's
> > fine for 2.2.19pre3 as well and we can keep using a read-write lock then.
> 
> I really like this approach.  It fixes another problem in 2.4:
> 
> Example:
> 
> static struct request *__get_request_wait(request_queue_t *q, int rw)
> {
> register struct request *rq;
> DECLARE_WAITQUEUE(wait, current);
> 
> add_wait_queue_exclusive(&q->wait_for_request, &wait);
> for (;;) {
>         __set_current_state(TASK_UNINTERRUPTIBLE);
>         /* WINDOW HERE */
>         spin_lock_irq(&io_request_lock);
>         rq = get_request(q, rw);
>         spin_unlock_irq(&io_request_lock);

note that the above is racy and can lose a wakeup; 2.4.x needs
set_current_state there (not __set_current_state): spin_lock isn't a two-way
barrier, it only forbids stuff from exiting the critical section. So on some
architectures (not on alpha, for example) the cpu could reorder the code
this way:

spin_lock_irq()
rq = get_request
__set_current_state
spin_unlock_irq

That inverts the order of operations. It needs to be fixed too (luckily
it's a one-liner).

>         if (rq)
>                 break;
>         generic_unplug_device(q);
>         schedule();
> }
> remove_wait_queue(&q->wait_for_request, &wait);
> current->state = TASK_RUNNING;
> return rq;
> }
> 
> If this task enters the schedule() and is then woken, and another
> wakeup is sent to the waitqueue while this task is executing in
> the marked window, __wake_up_common() will try to wake this
> task a second time and will then stop looking for tasks to wake.
> 
> The outcome: two wakeups sent to the queue, but only one task woken.

Correct.

And btw such a race is new; it must have been introduced in late 2.4.0-test1X or
so. I'm sure it couldn't happen in the whole of 2.3.x and 2.4.0-testX because the
wakeup was atomically clearing the exclusive bit from task->state.

Still talking about late 2.4.x changes: why has add_wait_queue_exclusive gone
into kernel/fork.c?? That's obviously not the right place :).

> I haven't thought about it super-hard, but I think that if
> __wake_up_common's exclusive-mode handling were changed
> as you describe, so that it keeps on scanning the queue until it has
> *definitely* moved a task onto the runqueue then this
> problem goes away.

Yes, that's true.

> > Those races are of course orthogonal to the issue we discussed previously
> > in this thread: a task registered in two waitqueues and wanting an exclusive
> > wakeup from one waitqueue and a wake-all from the other (to address
> > that we need to move the wake-one information from the task struct
> > to the waitqueue_head, and I still think that shouldn't be addressed in 2.2.x;
> > 2.2.x is fine with per-task-struct wake-one information)
> 
> OK by me, as long as people don't incautiously start using the
> capability for other things.
> 
> > Should I take care of the 2.2.x fix, or will you take care of it? I'm not using
> > the wake-one patch in 2.2.19pre3 because I don't like it (starting from the
> > useless wmb() in accept), so if you want to take care of 2.2.19pre3 yourself I'd
> > suggest applying the wake-one patch against 2.2.19pre3 in my ftp-patch area
> > first.  Otherwise give me an ack and I'll extend my wake-one patch myself to
> > ignore the wake_up_process()es that don't move the task into the runqueue.
> 
> ack.
> 
> I'll take another look at the 2.4 patch and ask you to review that
> when I've finished with the netdevice wetworks, if 

Re: Linux 2.2.19pre2

2000-12-23 Thread Andrew Morton

Andrea Arcangeli wrote:
> 
> On Sat, Dec 23, 2000 at 05:56:42PM +1100, Andrew Morton wrote:
> > If we elect to not address this problem in 2.2 and to rely upon the network
> 
> I see. There are two races:
> 
> 1)  race inside __wake_up when it's run on the same waitqueue: 2.2.19pre3
> is affected as well as 2.2.18aa2, and 2.4.x is affected
> as well when #defining USE_RW_WAIT_QUEUE_SPINLOCK to 1.

mm...  I think we should kill the USE_RW_WAIT_QUEUE_SPINLOCK option.

> 2)  race between two parallel __wake_up running on different waitqueues
> (both 2.2.x and 2.4.x are affected)
> 
> 1) could be fixed trivially by making the waitqueue_lock a spinlock, but
> this way doesn't solve 2). And if we solve 2) properly then 1) gets fixed as
> well.

I don't understand the problem with 2) in 2.2?  Every task on both waitqueues
gets woken up.  Won't it sort itself out OK?

For 2.4, 2) is an issue because we can have tasks on two waitqueues at the
same time, with a mix of exclusive and non.  Putting a global spinlock
into __wake_up_common would fix it, but was described as "horrid" by
you-know-who :)

> I agree the right fix for 2) (and in turn for 1) ) is to count only the
> exclusive wake_up_process calls that move the task into the runqueue; if the
> task was already in the runqueue we must not consider it an exclusive wakeup
> (so in turn we'll try again to wake up the next exclusive-wakeup waiter).
> This will fix both races. Since the fix is self-contained in __wake_up it's
> fine for 2.2.19pre3 as well and we can keep using a read-write lock then.

I really like this approach.  It fixes another problem in 2.4:

Example:

static struct request *__get_request_wait(request_queue_t *q, int rw)
{
        register struct request *rq;
        DECLARE_WAITQUEUE(wait, current);

        add_wait_queue_exclusive(&q->wait_for_request, &wait);
        for (;;) {
                __set_current_state(TASK_UNINTERRUPTIBLE);
                /* WINDOW HERE */
                spin_lock_irq(&io_request_lock);
                rq = get_request(q, rw);
                spin_unlock_irq(&io_request_lock);
                if (rq)
                        break;
                generic_unplug_device(q);
                schedule();
        }
        remove_wait_queue(&q->wait_for_request, &wait);
        current->state = TASK_RUNNING;
        return rq;
}

If this task enters the schedule() and is then woken, and another
wakeup is sent to the waitqueue while this task is executing in
the marked window, __wake_up_common() will try to wake this
task a second time and will then stop looking for tasks to wake.

The outcome: two wakeups sent to the queue, but only one task woken.

I haven't thought about it super-hard, but I think that if
__wake_up_common's exclusive-mode handling were changed
as you describe, so that it keeps on scanning the queue until it has
*definitely* moved a task onto the runqueue then this
problem goes away.

> Those races are of course orthogonal to the issue we discussed previously
> in this thread: a task registered in two waitqueues and wanting an exclusive
> wakeup from one waitqueue and a wake-all from the other (to address
> that we need to move the wake-one information from the task struct
> to the waitqueue_head, and I still think that shouldn't be addressed in 2.2.x;
> 2.2.x is fine with per-task-struct wake-one information)

OK by me, as long as people don't incautiously start using the
capability for other things.

> Should I take care of the 2.2.x fix, or will you take care of it? I'm not using
> the wake-one patch in 2.2.19pre3 because I don't like it (starting from the
> useless wmb() in accept), so if you want to take care of 2.2.19pre3 yourself I'd
> suggest applying the wake-one patch against 2.2.19pre3 in my ftp-patch area
> first.  Otherwise give me an ack and I'll extend my wake-one patch myself to
> ignore the wake_up_process()es that don't move the task into the runqueue.

ack.

I'll take another look at the 2.4 patch and ask you to review that
when I've finished with the netdevice wetworks, if that's
OK.




Re: Linux 2.2.19pre2

2000-12-23 Thread Andrea Arcangeli

On Sat, Dec 23, 2000 at 05:56:42PM +1100, Andrew Morton wrote:
> If we elect to not address this problem in 2.2 and to rely upon the network

I see. There are two races:

1)  race inside __wake_up when it's run on the same waitqueue: 2.2.19pre3
is affected as well as 2.2.18aa2, and 2.4.x is affected
as well when #defining USE_RW_WAIT_QUEUE_SPINLOCK to 1.
2)  race between two parallel __wake_up running on different waitqueues
(both 2.2.x and 2.4.x are affected)

1) could be fixed trivially by making the waitqueue_lock a spinlock, but
this way doesn't solve 2). And if we solve 2) properly then 1) gets fixed as
well.

I agree the right fix for 2) (and in turn for 1) ) is to count only the
exclusive wake_up_process calls that move the task into the runqueue; if the
task was already in the runqueue we must not consider it an exclusive wakeup
(so in turn we'll try again to wake up the next exclusive-wakeup waiter).
This will fix both races. Since the fix is self-contained in __wake_up it's
fine for 2.2.19pre3 as well and we can keep using a read-write lock then.

Those races are of course orthogonal to the issue we discussed previously
in this thread: a task registered in two waitqueues and wanting an exclusive
wakeup from one waitqueue and a wake-all from the other (to address
that we need to move the wake-one information from the task struct
to the waitqueue_head, and I still think that shouldn't be addressed in 2.2.x;
2.2.x is fine with per-task-struct wake-one information)

Should I take care of the 2.2.x fix, or will you take care of it? I'm not using
the wake-one patch in 2.2.19pre3 because I don't like it (starting from the
useless wmb() in accept), so if you want to take care of 2.2.19pre3 yourself I'd
suggest applying the wake-one patch against 2.2.19pre3 in my ftp-patch area
first.  Otherwise give me an ack and I'll extend my wake-one patch myself to
ignore the wake_up_process()es that don't move the task into the runqueue.

Thanks,
Andrea



Re: Linux 2.2.19pre2

2000-12-23 Thread Andrea Arcangeli

On Sat, Dec 23, 2000 at 05:56:42PM +1100, Andrew Morton wrote:
 If we elect to not address this problem in 2.2 and to rely upon the network

I see. There are two races:

1)  race inside __wake_up when it's run on the same waitqueue: 2.2.19pre3
is affected as well as 2.2.18aa2, and 2.4.x is affected
as well when #defining USE_RW_WAIT_QUEUE_SPINLOCK to 1.
2)  race between two parallel __wake_up running on different waitqueues
(both 2.2.x and 2.4.x are affected)

1) could be fixed trivially by making the waitqueue_lock a spinlock, but
this way doesn't solve 2). And if we solve 2) properly than 1) gets fixed as
well.

I agree the right fix for 2) (and in turn for 1) ) is to count the number of
exclusive wake_up_process that moves the task in the runqueue, if the task was
just in the runqueue we must not consider it as an exclusive wakeup (so in turn
we'll try again to wakeup the next exclusive-wakeup waiter). This will
fix both races. Since the fix is self contained in __wake_up it's fine
for 2.2.19pre3 as well and we can keep using a read_write lock then.

Those races of course are orthogonal with the issue we discussed previously
in this thread: a task registered in two waitqueues and wanting an exclusive
wakeup from one waitqueue and a wake-all from the other waitqueue (for
addressing that we need to move the wake-one information from the task struct
to the waitqueue_head and I still think that shoudln't be addressed in 2.2.x,
2.2.x is fine with a per-task-struct wake-one information)

Should I take care of the 2.2.x fix, or will you take care of it? I'm not using
the wake-one patch in 2.2.19pre3 because I don't like it (starting from the
useless wmb() in accept) so if you want to take care of 2.2.19pre3 yourself I'd
suggest to apply the wake-one patch against 2.2.19pre3 in my ftp-patch area
first.  Otherwise give me an ack and I'll extend myself my wake-one patch to
ignore the wake_up_process()es that doesn't move the task in the runqueue.

Thanks,
Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Linux 2.2.19pre2

2000-12-23 Thread Andrew Morton

Andrea Arcangeli wrote:
 
 On Sat, Dec 23, 2000 at 05:56:42PM +1100, Andrew Morton wrote:
  If we elect to not address this problem in 2.2 and to rely upon the network
 
 I see. There are two races:
 
 1)  race inside __wake_up when it's run on the same waitqueue: 2.2.19pre3
 is affected as well as 2.2.18aa2, and 2.4.x is affected
 as well when #defining USE_RW_WAIT_QUEUE_SPINLOCK to 1.

mm...  I think we should kill the USE_RW_WAIT_QUEUE_SPINLOCK option.

 2)  race between two parallel __wake_up running on different waitqueues
 (both 2.2.x and 2.4.x are affected)
 
 1) could be fixed trivially by making the waitqueue_lock a spinlock, but
 this way doesn't solve 2). And if we solve 2) properly than 1) gets fixed as
 well.

I don't understand the problem with 2) in 2.2?  Every task on both waitqueues
gets woken up.  Won't it sort itself out OK?

For 2.4, 2) is an issue because we can have tasks on two waitqueues at the
same time, with a mix of exclusive and non.  Putting a global spinlock
into __wake_up_common would fix it, but was described as "horrid" by
you-know-who :)

 I agree the right fix for 2) (and in turn for 1) ) is to count the number of
 exclusive wake_up_process that moves the task in the runqueue, if the task was
 just in the runqueue we must not consider it as an exclusive wakeup (so in turn
 we'll try again to wakeup the next exclusive-wakeup waiter). This will
 fix both races. Since the fix is self contained in __wake_up it's fine
 for 2.2.19pre3 as well and we can keep using a read_write lock then.

I really like this approach.  It fixes another problem in 2.4:

Example:

static struct request *__get_request_wait(request_queue_t *q, int rw)
{
register struct request *rq;
DECLARE_WAITQUEUE(wait, current);

add_wait_queue_exclusive(q-wait_for_request, wait);
for (;;) {
__set_current_state(TASK_UNINTERRUPTIBLE);
/* WINDOW HERE */
spin_lock_irq(io_request_lock);
rq = get_request(q, rw);
spin_unlock_irq(io_request_lock);
if (rq)
break;
generic_unplug_device(q);
schedule();
}
remove_wait_queue(q-wait_for_request, wait);
current-state = TASK_RUNNING;
return rq;
}

If this task enters the schedule() and is then woken, and another
wakeup is sent to the waitqueue while this task is executing in
the marked window, __wake_up_common() will try to wake this
task a second time and will then stop looking for tasks to wake.

The outcome: two wakeups sent to the queue, but only one task woken.

I haven't thought about it super-hard, but I think that if
__wake_up_common's exclusive-mode handling were changed
as you describe, so that it keeps on scanning the queue until it has
*definitely* moved a task onto the runqueue then this
problem goes away.

 Those races of course are orthogonal with the issue we discussed previously
 in this thread: a task registered in two waitqueues and wanting an exclusive
 wakeup from one waitqueue and a wake-all from the other waitqueue (for
 addressing that we need to move the wake-one information from the task struct
 to the waitqueue_head and I still think that shoudln't be addressed in 2.2.x,
 2.2.x is fine with a per-task-struct wake-one information)

OK by me, as long as people don't uncautiously start using the
capability for other things.

 Should I take care of the 2.2.x fix, or will you take care of it? I'm not using
 the wake-one patch in 2.2.19pre3 because I don't like it (starting from the
 useless wmb() in accept) so if you want to take care of 2.2.19pre3 yourself I'd
 suggest to apply the wake-one patch against 2.2.19pre3 in my ftp-patch area
 first.  Otherwise give me an ack and I'll extend myself my wake-one patch to
 ignore the wake_up_process()es that doesn't move the task in the runqueue.

ack.

I'll take another look at the 2.4 patch and ask you to review that
when I've finished with the netdevice wetworks, if that's
OK.

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Linux 2.2.19pre2

2000-12-23 Thread Andrea Arcangeli

On Sun, Dec 24, 2000 at 11:23:33AM +1100, Andrew Morton wrote:
 Andrea Arcangeli wrote:
  1) could be fixed trivially by making the waitqueue_lock a spinlock, but
  this way doesn't solve 2). And if we solve 2) properly than 1) gets fixed as

BTW (follow up myself), really making the lock a spinlock (not a readwrite
lock) would fix 2) as well (waitqueue_lock is global in 2.2.x I was thinking at
the per-waitqueue lock of 2.4.x ;).

  well.
 
 I don't understand the problem with 2) in 2.2?  Every task on both waitqueues
 gets woken up.  Won't it sort itself out OK?

Not every task if it's a wake-one on both waitqueues. The problem should be the
same in 2.2.x and 2.4.x. But if such usage makes sense is uncertain...

 For 2.4, 2) is an issue because we can have tasks on two waitqueues at the
 same time, with a mix of exclusive and non.  Putting a global spinlock
 into __wake_up_common would fix it, but was described as "horrid" by
 you-know-who :)

Yes. And that wouldn't fix the race number 3) below.

  I agree the right fix for 2) (and in turn for 1) ) is to count the number of
  exclusive wake_up_process that moves the task in the runqueue, if the task was
  just in the runqueue we must not consider it as an exclusive wakeup (so in turn
  we'll try again to wakeup the next exclusive-wakeup waiter). This will
  fix both races. Since the fix is self contained in __wake_up it's fine
  for 2.2.19pre3 as well and we can keep using a read_write lock then.
 
> I really like this approach.  It fixes another problem in 2.4:
>
> Example:
>
> static struct request *__get_request_wait(request_queue_t *q, int rw)
> {
>         register struct request *rq;
>         DECLARE_WAITQUEUE(wait, current);
>
>         add_wait_queue_exclusive(&q->wait_for_request, &wait);
>         for (;;) {
>                 __set_current_state(TASK_UNINTERRUPTIBLE);
>                 /* WINDOW HERE */
>                 spin_lock_irq(&io_request_lock);
>                 rq = get_request(q, rw);
>                 spin_unlock_irq(&io_request_lock);

note that the above is racy and can lose a wakeup: 2.4.x needs
set_current_state there (not __set_current_state). spin_lock isn't a two-way
barrier, it only forbids stuff to escape the critical section. So on some
architectures (not on the alpha, for example) the CPU could reorder the code
this way:

spin_lock_irq()
rq = get_request
__set_current_state
spin_unlock_irq

inverting the order of operations. That needs to be fixed too (luckily
it's a one-liner).

>                 if (rq)
>                         break;
>                 generic_unplug_device(q);
>                 schedule();
>         }
>         remove_wait_queue(&q->wait_for_request, &wait);
>         current->state = TASK_RUNNING;
>         return rq;
> }
 
> If this task enters the schedule() and is then woken, and another
> wakeup is sent to the waitqueue while this task is executing in
> the marked window, __wake_up_common() will try to wake this
> task a second time and will then stop looking for tasks to wake.
>
> The outcome: two wakeups sent to the queue, but only one task woken.

Correct.

And btw such race is new; it must have been introduced in late 2.4.0-test1X or
so. I'm sure it couldn't happen in the whole of 2.3.x and 2.4.0-testX because the
wakeup was atomically clearing the exclusive bit from the task->state.

Still talking about late 2.4.x changes, why has add_wait_queue_exclusive gone
into kernel/fork.c?? That's obviously not the right place :).

> I haven't thought about it super-hard, but I think that if
> __wake_up_common's exclusive-mode handling were changed
> as you describe, so that it keeps on scanning the queue until it has
> *definitely* moved a task onto the runqueue then this
> problem goes away.

Yes, that's true.

> > Those races of course are orthogonal with the issue we discussed previously
> > in this thread: a task registered in two waitqueues and wanting an exclusive
> > wakeup from one waitqueue and a wake-all from the other waitqueue (for
> > addressing that we need to move the wake-one information from the task struct
> > to the waitqueue_head, and I still think that shouldn't be addressed in 2.2.x;
> > 2.2.x is fine with per-task-struct wake-one information)
 
> OK by me, as long as people don't incautiously start using the
> capability for other things.
 
> > Should I take care of the 2.2.x fix, or will you take care of it? I'm not using
> > the wake-one patch in 2.2.19pre3 because I don't like it (starting from the
> > useless wmb() in accept), so if you want to take care of 2.2.19pre3 yourself I'd
> > suggest applying the wake-one patch against 2.2.19pre3 in my ftp patch area
> > first.  Otherwise give me an ack and I'll extend my wake-one patch to
> > ignore the wake_up_process()es that don't move the task onto the runqueue.
 
> ack.
>
> I'll take another look at the 2.4 patch and ask you to review that
> when I've finished with the netdevice work, if that's OK.

OK. Thanks for the help.

Andrea

Re: Linux 2.2.19pre2

2000-12-23 Thread Andrew Morton

Andrea Arcangeli wrote:
>
> ...
> >                 if (rq)
> >                         break;
> >                 generic_unplug_device(q);
> >                 schedule();
> >         }
> >         remove_wait_queue(&q->wait_for_request, &wait);
> >         current->state = TASK_RUNNING;
> >         return rq;
> > }
>
> > If this task enters the schedule() and is then woken, and another
> > wakeup is sent to the waitqueue while this task is executing in
> > the marked window, __wake_up_common() will try to wake this
> > task a second time and will then stop looking for tasks to wake.
> >
> > The outcome: two wakeups sent to the queue, but only one task woken.
>
> Correct.
>
> And btw such race is new; it must have been introduced in late 2.4.0-test1X or
> so. I'm sure it couldn't happen in the whole of 2.3.x and 2.4.0-testX because the
> wakeup was atomically clearing the exclusive bit from the task->state.
 

This could happen with the old scheme where exclusiveness
was stored in the task, not the waitqueue.

From test4:

for (;;) {
        __set_current_state(TASK_UNINTERRUPTIBLE | TASK_EXCLUSIVE);
        /* WINDOW */
        spin_lock_irq(&io_request_lock);
        rq = get_request(q, rw);
        spin_unlock_irq(&io_request_lock);
        if (rq)
                break;
        generic_unplug_device(q);
        schedule();
}

As soon as we set TASK_UNINTERRUPTIBLE|TASK_EXCLUSIVE, this
task becomes visible to __wake_up_common and can soak
up a second wakeup.  I assume this hasn't been a reported problem
because request queues get lots of wakeups sent to them?

Still, changing the wakeup code in the way we've discussed
seems the way to go.  It also makes the extra wake_up()
at the end of x86's __down() unnecessary, which is a small
performance win - semaphores are currently wake-two.

But Linus had a different reason why that wakeup was there.
Need to dig out the email and stare at it.  But I don't see a
good reason to muck with the semaphores at this time.




Re: Linux 2.2.19pre2

2000-12-23 Thread Andrea Arcangeli

On Sun, Dec 24, 2000 at 01:28:59PM +1100, Andrew Morton wrote:
> This could happen with the old scheme where exclusiveness
> was stored in the task, not the waitqueue.
>
> From test4:
>
> for (;;) {
>         __set_current_state(TASK_UNINTERRUPTIBLE | TASK_EXCLUSIVE);
>         /* WINDOW */
>         spin_lock_irq(&io_request_lock);
>         rq = get_request(q, rw);
>         spin_unlock_irq(&io_request_lock);
>         if (rq)
>                 break;
>         generic_unplug_device(q);
>         schedule();
> }
>
> As soon as we set TASK_UNINTERRUPTIBLE|TASK_EXCLUSIVE, this
> task becomes visible to __wake_up_common and can soak
> up a second wakeup. [..]

I disagree, no wakeup could be lost in test4 in the above section of code
(besides the __set_current_state that should be set_current_state ;), and that's
not an issue on either x86 or alpha, where spin_lock is a full memory barrier.

Let's examine what happens in test4:

1) before the `for' loop, the task is already visible to __wake_up_common

2) as soon as we set the task state to TASK_UNINTERRUPTIBLE|TASK_EXCLUSIVE
   the task _starts_ asking for an exclusive wakeup from __wake_up_common

At this point consider two different cases:

3a) assume we get a wakeup in the /* WINDOW */: the task->state becomes
runnable from under us (no problem, that's expected). And if it's true we
got a wakeup in /* WINDOW */, then it means get_request will also return
a valid rq, so we'll break the loop and use the resource. No wakeup is
lost in this case. So far so good. This was the 1-wakeup case.

3b) Assume _two_ wakeups happen in /* WINDOW */: the first one just makes
us RUNNABLE as in the 3a) case, the second one will wake up any _other_
wake-one task _after_ us in the wait queue (normal FIFO order), because the
first wakeup implicitly made us RUNNABLE (no longer TASK_EXCLUSIVE, so we
were no longer asking for an exclusive wakeup).

The task after us may as well be running in /* WINDOW */, and in such a case
it will get a `rq' from get_request too (as in case 3a, without having
to sleep). All right.

No wakeup can be lost in any case. Two wakeups happened and two
tasks got their I/O request, even if both wakeups happened in /* WINDOW */.
The example with N wakeups and N tasks is equivalent.

This race was introduced in test1X because there the task keeps asking for
an exclusive wakeup even after getting a wakeup: until it has the time to
clean up and unregister explicitly.

And btw, 2.2.19pre3 has the same race, while the alternate wake-one
implementation in 2.2.18aa2 doesn't have it (for the same reasons).

And /* WINDOW */ is not the only window for such a race: the window is the whole
section where the task is registered in the waitqueue in exclusive mode:

add_wait_queue_exclusive
/* all is WINDOW here */
remove_wait_queue

Until remove_wait_queue runs explicitly in the task's context, any second wakeup
will keep waking up only that task (so in turn it will be lost). So it should
trigger very easily under high load, since two wakeups will happen much faster
than the task can be rescheduled in order to run remove_wait_queue and clean up.

Any deadlocks in test1X could be very well explained by this race condition.

> Still, changing the wakeup code in the way we've discussed
> seems the way to go. [..]

I agree. I'm quite convinced it's the right way too, and of course I see why we
moved the exclusive information into the waitqueue instead of keeping it in the
task struct: to provide sane semantics in the long run ;).

Andrea



Re: Linux 2.2.19pre2

2000-12-23 Thread Andrew Morton

Andrea Arcangeli wrote:
>
> On Sun, Dec 24, 2000 at 01:28:59PM +1100, Andrew Morton wrote:
> > This could happen with the old scheme where exclusiveness
> > was stored in the task, not the waitqueue.
> >
> > From test4:
> >
> > for (;;) {
> >         __set_current_state(TASK_UNINTERRUPTIBLE | TASK_EXCLUSIVE);
> >         /* WINDOW */
> >         spin_lock_irq(&io_request_lock);
> >         rq = get_request(q, rw);
> >         spin_unlock_irq(&io_request_lock);
> >         if (rq)
> >                 break;
> >         generic_unplug_device(q);
> >         schedule();
> > }
> >
> > As soon as we set TASK_UNINTERRUPTIBLE|TASK_EXCLUSIVE, this
> > task becomes visible to __wake_up_common and can soak
> > up a second wakeup. [..]
>
> I disagree, no wakeup could be lost in test4 in the above section of code
> (besides the __set_current_state that should be set_current_state ;), and that's
> not an issue on either x86 or alpha, where spin_lock is a full memory barrier.
>
> Let's examine what happens in test4:
>
> [ snip ]

I was talking about a different scenario:

add_wait_queue_exclusive(&q->wait_for_request, &wait);
for (;;) {
        __set_current_state(TASK_UNINTERRUPTIBLE);
        /* WINDOW */
        spin_lock_irq(&io_request_lock);
        rq = get_request(q, rw);
        spin_unlock_irq(&io_request_lock);
        if (rq)
                break;
        generic_unplug_device(q);
        schedule();
}
remove_wait_queue(&q->wait_for_request, &wait);

Suppose there are two tasks sleeping in the schedule().

A wakeup comes.  One task wakes.  It loops around and reaches
the window.  At this point in time, another wakeup gets sent
to the waitqueue. It gets directed to the task which just
woke up!  It should have been directed to a sleeping task,
not the current one, which just *looks* like it's sleeping
because it's in state TASK_UNINTERRUPTIBLE.

This can happen in test4.

> This race was introduced in test1X because there the task keeps asking for
> an exclusive wakeup even after getting a wakeup: until it has the time to
> clean up and unregister explicitly.

No, I think it's the same in test4 and test1X.  In current kernels
it's still the case that the woken task gets atomically switched into
state TASK_RUNNING, and then becomes "hidden" from the wakeup code.

The problem is, the task bogusly sets itself back into
TASK_UNINTERRUPTIBLE for a very short period and becomes
eligible for another wakeup.

There's not much which ll_rw_blk.c can do about this.

> And btw, 2.2.19pre3 has the same race, while the alternate wake-one
> implementation in 2.2.18aa2 doesn't have it (for the same reasons).

The above scenario can happen in all kernels.

> And /* WINDOW */ is not the only window for such a race: the window is the whole
> section where the task is registered in the waitqueue in exclusive mode:
>
> add_wait_queue_exclusive
> /* all is WINDOW here */
> remove_wait_queue
>
> Until remove_wait_queue runs explicitly in the task's context, any second wakeup
> will keep waking up only that task (so in turn it will be lost). So it should
> trigger very easily under high load, since two wakeups will happen much faster
> than the task can be rescheduled in order to run remove_wait_queue and clean up.
>
> Any deadlocks in test1X could be very well explained by this race condition.

Yes, but I haven't seen a "stuck in D state" report for a while.
I assume this is because this waitqueue gets lots of wakeups sent to it.

The ones in networking I expect are protected by other locking
which serialises the wakeups.

The one in the x86 semaphores avoids it by sending an extra wakeup
to the waitqueue on the way out.

> > Still, changing the wakeup code in the way we've discussed
> > seems the way to go. [..]
>
> I agree. I'm quite convinced it's the right way too, and of course I see why we
> moved the exclusive information into the waitqueue instead of keeping it in the
> task struct: to provide sane semantics in the long run ;).

Linus suggested at one point that we clear the waitqueue's
WQ_FLAG_EXCLUSIVE bit when we wake it up, so it falls back
to non-exclusive mode.  This was to address one of the
task-on-two-waitqueues problems...
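A sketch of how that suggestion might look in __get_request_wait() (purely hypothetical pseudocode, not an actual patch; the WQ_FLAG_EXCLUSIVE re-arming in the loop is the assumed part): the waker clears WQ_FLAG_EXCLUSIVE when it picks an entry, and the waiter re-arms the flag only when it is really about to sleep, so a woken-but-still-queued task stops soaking up exclusive wakeups:

```c
/* Hypothetical sketch, not a real patch: the waker clears
 * WQ_FLAG_EXCLUSIVE, the waiter re-arms it just before sleeping again. */
add_wait_queue_exclusive(&q->wait_for_request, &wait);
for (;;) {
        set_current_state(TASK_UNINTERRUPTIBLE);
        spin_lock_irq(&io_request_lock);
        rq = get_request(q, rw);
        spin_unlock_irq(&io_request_lock);
        if (rq)
                break;
        wait.flags |= WQ_FLAG_EXCLUSIVE;  /* ask for exclusive wakeup again */
        generic_unplug_device(q);
        schedule();
}
remove_wait_queue(&q->wait_for_request, &wait);
```

While the task is in the window its entry is non-exclusive, so a second wakeup passes it by and wakes a genuinely sleeping waiter instead.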




Re: Linux 2.2.19pre2

2000-12-22 Thread Andrew Morton

Andrea Arcangeli wrote:
> 
> [ 2.2 waitqueue stuff ]
>

Andrea, it occurs to me...

__wake_up() is using read_lock(waitqueue_lock).  This means that
two CPUs could simultaneously run __wake_up().  They could both
find the same task and they could both try to wake it up. The
net result would be two wakeup events, but only one task woken
up.  That's a lost wakeup.

Now, it's probably the case that the 2.2 network serialisation
prevents this from happening - I haven't looked.  If not,
then we need to use spin_lock_irqsave.

Alternatively, we can teach wake_up_process to return a success
value if it successfully moved a task onto the runqueue.  Test that
in __wake_up and keep on going if it's zero.

2.4 needs this as well, BTW, to fix an SMP race where a task is on two
waitqueues at the same time.  I did a patch which took the "wake_up_process
returns success value" approach and it felt clean.  I haven't submitted
it due to general end-of-year patch exhaustion :)

If we elect to not address this problem in 2.2 and to rely upon the network
locking then we need to put a BIG FAT warning in the code somewhere, because
if someone else tries to use the new wake-one capability, they may not be
so lucky.




Re: Linux 2.2.19pre2

2000-12-22 Thread Andrea Arcangeli

On Fri, Dec 22, 2000 at 06:33:00PM +1100, Andrew Morton wrote:
> add_waitqueue_exclusive() and TASK_EXCLUSIVE, add a

There's no add_waitqueue_exclusive in my patch.

> Except for this bit, which looks slightly fatal:
> 
>         /*
>          * We can drop the read-lock early if this
>          * is the only/last process.
>          */
>         if (next == head) {
>                 read_unlock(&waitqueue_lock);
>                 wake_up_process(p);
>                 goto out;
>         }
> 
> Once the waitqueue_lock has been dropped, the task at `p'
> is free to remove itself from the waitqueue and exit.  This
> CPU can then try to wake up a non-existent task, no?

Yes, that was an unlikely-to-happen SMP race I inherited from 2.2.18 and all
previous 2.2.x vanilla kernels. Thanks.

Andrea



Re: Linux 2.2.19pre2

2000-12-21 Thread Andrew Morton

Andrea Arcangeli wrote:
> 
> > > Other thing about your patch, adding TASK_EXCLUSIVE to
> > > wake_up/wake_up_interruptible is useless.
> >
> > This enables wake_up_all().
> 
> It is useless as it is in 2.2.19pre2: there's no wake_up_all in 2.2.19pre2.

#define wake_up_all(x) __wake_up((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE)

> > Anyway, this is all just noise.
> >
> > The key question is: which of the following do we want?
> >
> > a) A simple, specific accept()-accelerator, and 2.2 remains without
> >an exclusive wq API or
> 
> To make the accelerator we need minimal wake-one support. So a) doesn't
> make sense to me.

It makes heaps of sense.  We've introduced into 2.2 an API
which has the same appearance as one in 2.4, but which is
subtly broken wrt the 2.4 one.

I suggest you change the names to something other than
add_waitqueue_exclusive() and TASK_EXCLUSIVE, add a
cautionary comment and then go ahead with your patch.

Except for this bit, which looks slightly fatal:

        /*
         * We can drop the read-lock early if this
         * is the only/last process.
         */
        if (next == head) {
                read_unlock(&waitqueue_lock);
                wake_up_process(p);
                goto out;
        }

Once the waitqueue_lock has been dropped, the task at `p'
is free to remove itself from the waitqueue and exit.  This
CPU can then try to wake up a non-existent task, no?




Re: Linux 2.2.19pre2

2000-12-21 Thread Andrea Arcangeli

On Wed, Dec 20, 2000 at 02:28:58PM +0100, Andrea Arcangeli wrote:
> I was in the process of fixing this (I also just backported the thinkpad
> %edx clobber fix), but if somebody is going to work on this please let
> me know so we stay in sync.

Ok this should fix the e820 memory detection, against 2.2.19pre2:


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.19pre2/e820-fix-1

While fixing the code I noticed some bugs were inherited from 2.4.x, so I
forward-ported the fixes to 2.4.0-test13-pre3:


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.0-test13-pre3/e820-fix-1

I also include them below so they're handy for Linus:

diff -urN 2.4.0-test13-pre3/arch/i386/kernel/setup.c 2.4.0-test13-pre3-e820/arch/i386/kernel/setup.c
--- 2.4.0-test13-pre3/arch/i386/kernel/setup.c  Thu Dec 14 22:33:59 2000
+++ 2.4.0-test13-pre3-e820/arch/i386/kernel/setup.c Thu Dec 21 21:12:47 2000
@@ -477,7 +477,7 @@
 		if (start < 0x100000ULL && end > 0xA0000ULL) {
 			if (start < 0xA0000ULL)
 				add_memory_region(start, 0xA0000ULL-start, type);
-			if (end < 0x100000ULL)
+			if (end <= 0x100000ULL)
 				continue;
 			start = 0x100000ULL;
 			size = end - start;
@@ -518,7 +518,8 @@
 
 		e820.nr_map = 0;
 		add_memory_region(0, LOWMEMSIZE(), E820_RAM);
-		add_memory_region(HIGH_MEMORY, mem_size << 10, E820_RAM);
+		add_memory_region(HIGH_MEMORY, (mem_size << 10)-HIGH_MEMORY,
+				  E820_RAM);
 	}
 	printk("BIOS-provided physical RAM map:\n");
 	print_memory_map(who);


The above patches don't include the thinkpad fix from Marc Joosen; a backport
of that bugfix is available separately here (because it's orthogonal to the
other bugfixes):


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.19pre2/thinkpad-e820-mjoosen-1

Andrea



Re: Linux 2.2.19pre2

2000-12-21 Thread Rik van Riel

On Thu, 21 Dec 2000, Andrea Arcangeli wrote:

> It would also be nice if you could show a real life
> showstopper-production-bottleneck where we need C) to fix it. I
> cannot see any useful usage of C in production 2.2.x.

Me neither.  I'm just wondering why the 2.2 semantics would be different
from the 2.4 ones, now that we still have the choice of just cut'n'pasting
over the (working) code from 2.4...

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: Linux 2.2.19pre2

2000-12-21 Thread Andrea Arcangeli

On Thu, Dec 21, 2000 at 03:07:08PM -0200, Rik van Riel wrote:
> c) will also implement a) in an obviously right and simple way.

So go ahead. If you think it's so simple and obviously right, you can post a
patch here against 2.2.19pre2 that implements C), to show real facts.

My B is here:


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.19pre2/wake-one-2

Then we will see how obviously right and simple C) is compared to B).

I don't need to see C) implemented to judge how obviously right and simple
it is, but if you think I'm wrong again: go ahead.

It would also be nice if you could show a real-life showstopper production
bottleneck where we need C) to fix it. I cannot see any useful usage of C)
in production 2.2.x.

Doing waitqueues the same way in 2.2.x and 2.4.x is an irrelevant point
(keeping the same API and semantics is much better than anything else for
2.2.x, unless there's some serious showstopper that isn't possible to fix
with B), and I still cannot see one).

People backporting drivers from 2.4.x will use wake-all as they had to do
during the whole of 2.3.x; that's obviously safe and trivial. If they know what
they're doing they can also use the 2.2.x wake-one API, provided their task is
registered in only one waitqueue (as in 99% of the usages I'm aware of, given
the whole of 2.3.x implemented B) too).

Andrea



Re: Linux 2.2.19pre2

2000-12-21 Thread Rik van Riel

On Thu, 21 Dec 2000, Andrea Arcangeli wrote:

> > The key question is: which of the following do we want?
> > 
> > a) A simple, specific accept()-accelerator, and 2.2 remains without
> >an exclusive wq API or
> 
> To make the accelerator we need minimal wake-one support. So a) doesn't
> make sense to me.
> 
> > b) A general purpose exclusive wq mechanism which does not correctly
> >    support waiting on two queues simultaneously where one is
> >    exclusive or
> 
> That's what we had in the whole of 2.3.x and that is better suited for 2.2.x,
> as it allows doing a) with an obviously right approach and minimal effort.
> 
> > c) A general purpose exclusive wq mechanism which _does_ support it.
> >
> > Each choice has merit!  You seem to want b).  davem wants c).
> 
> I have not read any email from DaveM who proposes C for
> 2.2.19pre3 (or 2.2.x in general). Are you sure he wasn't talking
> about 2.4.x?

c) will also implement a) in an obviously right and simple way.

I've still not seen ANY reason why we'd want 2.2 to have different
wake-one semantics from 2.4...

> > And given that 2.2 has maybe 2-4 years life left in it, I'd
> 
> I hope no ;).

People are still converting their 2.0 systems to 2.2 as
we speak. I really doubt that 2.2 has _less_ than 3 years
of life left ... as much as I hate this idea ;)

> > agree with David.  Let's do it once and do it right while the issue
> > is fresh in our minds.
> 
> I disagree. The changes to separate the two waitqueues even only
> for accept are not suitable for 2.2.x. Alan's filter should forbid
> those changes.  They're simply not worthwhile because I cannot
> even think of ...

They're not worthwhile just because you can't think of an example?

The same could be said of `b)' above. Do you have an example where
that is the preferable semantics ?

If both are equally preferable (ie. nobody can think of any example
where the corner case is being used), why do you keep insisting on
giving 2.2 different wake-one semantics from 2.4 ?

[these wake-one semantics may become rather important for people
backporting drivers to 2.2 over the next years ... or to something
else which nobody has even thought about ... 2-4 years is a long
time, so 2.2 maintainability is still an issue]

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: Linux 2.2.19pre2

2000-12-21 Thread Andrea Arcangeli

On Thu, Dec 21, 2000 at 09:38:43PM +1100, Andrew Morton wrote:
Andrea Arcangeli wrote:
> >
> > The fact you could mix non-exclusive and exclusive wakeups in the same waitqueue
> > was a feature, not a misfeature. Then of course you cannot register in two
> > waitqueues, one with wake-one and one with wake-all, but who does that anyway?
> > Definitely not an issue for 2.2.x.
>
> Definitely?  Let's think about that.

If you can tell me one showstopper bottleneck in 2.2.x that needs to register
in two waitqueues at the same time, one exclusive and one non-exclusive, then I
will think about it.

> > I think the real reason for separating the two things as davem proposed is
> > because otherwise we cannot register for a LIFO wake-one in O(1) as we needed
> > for accept.
>
> Yes.  In other words, if we try to do O(1) LIFO, we can cause lost wakeups.

We can do LIFO in O(N) just fine, no lost wakeups, with the "mix" way. The only
problem is additional complexity in the insertion operation.

> > Other thing about your patch, adding TASK_EXCLUSIVE to
> > wake_up/wake_up_interruptible is useless.
>
> This enables wake_up_all().

It is useless as it is in 2.2.19pre2: there's no wake_up_all in 2.2.19pre2.

> Anyway, this is all just noise.
>
> The key question is: which of the following do we want?
>
> a) A simple, specific accept()-accelerator, and 2.2 remains without
>    an exclusive wq API or

To make the accelerator we need minimal wake-one support. So a) doesn't
make sense to me.

> b) A general purpose exclusive wq mechanism which does not correctly
>    support waiting on two queues simultaneously where one is
>    exclusive or

That's what we had in the whole of 2.3.x and that is better suited for 2.2.x,
as it allows doing a) with an obviously right approach and minimal effort.

> c) A general purpose exclusive wq mechanism which _does_ support it.
>
> Each choice has merit!  You seem to want b).  davem wants c).

I have not read any email from DaveM who proposes C for 2.2.19pre3 (or 2.2.x in
general). Are you sure he wasn't talking about 2.4.x?

BTW, you also implemented b) in 2.2.19pre2 ;)

> And given that 2.2 has maybe 2-4 years life left in it, I'd

I hope not ;).

> agree with David.  Let's do it once and do it right while the issue
> is fresh in our minds.

I disagree. The changes to separate the two waitqueues even only for accept are
not suitable for 2.2.x. Alan's filter should forbid those changes.  They're simply
not worthwhile because I cannot even think of one bottleneck-showstopper place
that registers in two waitqueues and wants wake-one from one and wake-all from
the other in 2.2.x.

Andrea



Re: Linux 2.2.19pre2

2000-12-21 Thread Andrew Morton

Andrea Arcangeli wrote:
> 
> The fact you could mix non-exclusive and exlusive wakeups in the same waitqueue
> was a feature not a misfeature. Then of course you cannot register in two
> waitqueues one with wake-one and one with wake-all but who does that anyways?
> Definitely not an issue for 2.2.x.

Definitely?  Let's think about that.

> I think the real reason for separating the two things as davem proposed is
> because otherwise we cannot register for a LIFO wake-one in O(1) as we needed
> for accept.

Yes.  In other words, if we try to do O(1) LIFO, we can cause lost wakeups.

> Other thing about your patch, adding TASK_EXCLUSIVE to
> wake_up/wake_up_interruptible is useless.

This enables wake_up_all().



Anyway, this is all just noise.

The key question is: which of the following do we want?

a) A simple, specific accept()-accelerator, and 2.2 remains without
   an exclusive wq API or

b) A general purpose exclusive wq mechanism which does not correctly
   support waiting on two queues simultaneously where one is
   exclusive or

c) A general purpose exclusive wq mechanism which _does_ support it.

Each choice has merit!  You seem to want b).  davem wants c).

And given that 2.2 has maybe 2-4 years life left in it, I'd
agree with David.  Let's do it once and do it right while the issue
is fresh in our minds.

Yes?




Re: Linux 2.2.19pre2

2000-12-21 Thread Andrea Arcangeli

On Thu, Dec 21, 2000 at 09:38:43PM +1100, Andrew Morton wrote:
> Andrea Arcangeli wrote:
> > 
> > The fact you could mix non-exclusive and exclusive wakeups in the same waitqueue
> > was a feature, not a misfeature. Then of course you cannot register in two
> > waitqueues, one with wake-one and one with wake-all, but who does that anyway?
> > Definitely not an issue for 2.2.x.
> 
> Definitely?  Let's think about that.

If you can tell one showstopper bottleneck in 2.2.x that needs to register
in two waitqueues at the same time, one exclusive and one non-exclusive, then I
will think about it.

> > I think the real reason for separating the two things as davem proposed is
> > because otherwise we cannot register for a LIFO wake-one in O(1) as we needed
> > for accept.
> 
> Yes.  In other words, if we try to do O(1) LIFO, we can cause lost wakeups.

We can do LIFO in O(N) just fine, with no lost wakeups, the "mix" way. The only
problem is the additional complexity in the insertion operation.

> > One other thing about your patch: adding TASK_EXCLUSIVE to
> > wake_up/wake_up_interruptible is useless.
> 
> This enables wake_up_all().

It is useless as it is in 2.2.19pre2: there's no wake_up_all in 2.2.19pre2.

> Anyway, this is all just noise.
> 
> The key question is: which of the following do we want?
> 
> a) A simple, specific accept()-accelerator, and 2.2 remains without
>    an exclusive wq API or

To make the accelerator we need minimal wake-one support. So a) doesn't
make sense to me.

> b) A general purpose exclusive wq mechanism which does not correctly
>    support waiting on two queues simultaneously where one is
>    exclusive or

That's what we had throughout 2.3.x, and it is better suited for 2.2.x
as it allows us to do a) with an obviously right approach and minimal effort.

> c) A general purpose exclusive wq mechanism which _does_ support it.
> 
> Each choice has merit!  You seem to want b).  davem wants c).

I have not read any email from DaveM proposing c) for 2.2.19pre3 (or 2.2.x in
general). Are you sure he wasn't talking about 2.4.x?

BTW, you also implemented b) in 2.2.19pre2 ;)

> And given that 2.2 has maybe 2-4 years life left in it, I'd

I hope not ;).

> agree with David.  Let's do it once and do it right while the issue
> is fresh in our minds.

I disagree. The changes to separate the two waitqueues, even only for accept,
are not suitable for 2.2.x. Alan's filter should forbid those changes.  They're
simply not worthwhile because I cannot even think of one showstopper-bottleneck
place that registers in two waitqueues and wants wake-one from one and wake-all
from the other in 2.2.x.

Andrea



Re: Linux 2.2.19pre2

2000-12-21 Thread Rik van Riel

On Thu, 21 Dec 2000, Andrea Arcangeli wrote:

> > The key question is: which of the following do we want?
> > 
> > a) A simple, specific accept()-accelerator, and 2.2 remains without
> >    an exclusive wq API or
> 
> To make the accelerator we need minimal wake-one support. So a) doesn't
> make sense to me.
> 
> > b) A general purpose exclusive wq mechanism which does not correctly
> >    support waiting on two queues simultaneously where one is
> >    exclusive or
> 
> That's what we had throughout 2.3.x, and it is better suited for 2.2.x
> as it allows us to do a) with an obviously right approach and minimal effort.
> 
> > c) A general purpose exclusive wq mechanism which _does_ support it.
> > 
> > Each choice has merit!  You seem to want b).  davem wants c).
> 
> I have not read any email from DaveM proposing c) for
> 2.2.19pre3 (or 2.2.x in general). Are you sure he wasn't talking
> about 2.4.x?

c) will also implement a) in an obviously right and simple way.

I've still not seen ANY reason why we'd want 2.2 to have different
wake-one semantics from 2.4...

> > And given that 2.2 has maybe 2-4 years life left in it, I'd
> 
> I hope not ;).

People are still converting their 2.0 systems to 2.2 as
we speak. I really doubt that 2.2 has _less_ than 3 years
of life left ... as much as I hate this idea ;)

> > agree with David.  Let's do it once and do it right while the issue
> > is fresh in our minds.
> 
> I disagree. The changes to separate the two waitqueues, even only
> for accept, are not suitable for 2.2.x. Alan's filter should forbid
> those changes.  They're simply not worthwhile because I cannot
> even think of ...

They're not worthwhile just because you can't think of an example ?

The same could be said of `b)' above. Do you have an example where
that is the preferable semantics ?

If both are equally preferable (ie. nobody can think of any example
where the corner case is being used), why do you keep insisting on
giving 2.2 different wake-one semantics from 2.4 ?

[these wake-one semantics may become rather important for people
backporting drivers to 2.2 over the next years ... or to something
else which nobody has even thought about ... 2-4 years is a long
time, so 2.2 maintainability is still an issue]

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: Linux 2.2.19pre2

2000-12-21 Thread Andrea Arcangeli

On Thu, Dec 21, 2000 at 03:07:08PM -0200, Rik van Riel wrote:
> c) will also implement a) in an obviously right and simple way.

So go ahead. If you think that's so simple and obviously right, you can post a
patch here against 2.2.19pre2 that implements c), to show real facts.

My b) is here:


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.19pre2/wake-one-2

Then we will see how obviously right and simple c) is compared to b).

I don't need to see c) implemented to judge how obviously right and simple it
is, but if you think I'm wrong again: go ahead.

It would also be nice if you could show a real-life
showstopper-production-bottleneck where we need c) to fix it. I cannot see any
useful usage of c) in production 2.2.x.

Doing waitqueues the same way in 2.2.x and 2.4.x is an irrelevant point
(keeping the same API and semantics is much better than anything else for
2.2.x, unless there's some serious showstopper that isn't possible to fix with
b), and I still cannot see one).

People backporting drivers from 2.4.x will use wake-all, as they had to do
throughout 2.3.x; that's obviously safe and trivial. If they know what they're
doing they can also use the 2.2.x wake-one API, provided their task is
registered in only one waitqueue (as in 99% of the usages I'm aware of, given
that the whole of 2.3.x implemented b) too).

Andrea



Re: Linux 2.2.19pre2

2000-12-21 Thread Andrew Morton

Andrea Arcangeli wrote:
 
> > > One other thing about your patch: adding TASK_EXCLUSIVE to
> > > wake_up/wake_up_interruptible is useless.
> > 
> > This enables wake_up_all().
> 
> It is useless as it is in 2.2.19pre2: there's no wake_up_all in 2.2.19pre2.

#define wake_up_all(x) __wake_up((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE)

> > Anyway, this is all just noise.
> > 
> > The key question is: which of the following do we want?
> > 
> > a) A simple, specific accept()-accelerator, and 2.2 remains without
> >    an exclusive wq API or
> 
> To make the accelerator we need minimal wake-one support. So a) doesn't
> make sense to me.

It makes heaps of sense.  We've introduced into 2.2 an API
which has the same appearance as one in 2.4, but which is
subtly broken wrt the 2.4 one.

I suggest you change the names to something other than
add_waitqueue_exclusive() and TASK_EXCLUSIVE, add a
cautionary comment and then go ahead with your patch.

Except for this bit, which looks slightly fatal:

	/*
	 * We can drop the read-lock early if this
	 * is the only/last process.
	 */
	if (next == head) {
		read_unlock(&waitqueue_lock);
		wake_up_process(p);
		goto out;
	}

Once the waitqueue_lock has been dropped, the task at `p'
is free to remove itself from the waitqueue and exit.  This
CPU can then try to wake up a non-existent task, no?




Re: Linux 2.2.19pre2

2000-12-20 Thread Andrea Arcangeli

On Wed, Dec 20, 2000 at 03:48:06PM -0200, Rik van Riel wrote:
> On Wed, 20 Dec 2000, Andrea Arcangeli wrote:
> > On Thu, Dec 21, 2000 at 01:57:15AM +1100, Andrew Morton wrote:
> > > If a task is on two waitqueues at the same time it becomes a bug:
> > > if the outer waitqueue is non-exclusive and the inner is exclusive,
> > 
> > Your 2.2.x won't allow that either. You set the
> > `current->task_exclusive = 1' and so you will get an exclusive
> > wakeups in both waitqueues. You simply cannot tell register in
> > two waitqueue and expect a non-exlusive wakeup in one and an
> > exclusive wakeup in the other one.
> 
> Why not?  Having a wake-all wakeup on one event and a
> wake-one wakeup on another kind of event seems perfectly
> reasonable to me at a first glance. Is there something
> I've overlooked ?

I think you overlooked something; never mind. I only said kernel 2.2.19pre2 won't
allow that either. I'm not saying it's impossible to implement or unreasonable.

> > > Anyway, it's academic.  davem would prefer that we do it properly
> > > and move the `exclusive' flag into the waitqueue head to avoid the 
> > > task-on-two-waitqueues problem, as was done in 2.4.  I think he's
> > 
> > The fact you could mix non-exclusive and exlusive wakeups in the
> > same waitqueue was a feature not a misfeature. Then of course
> > you cannot register in two waitqueues one with wake-one and one
> > with wake-all but who does that anyways?
> 
> The "who does that anyways" argument could also be said about
> mixing exclusive and non-exclusive wakeups in the same waitqueue.
> Doing this seems rather confusing ... would you know any application
> which needs this ?

wake-one accept(2) in 2.2.x, unless you want to rewrite part of the TCP stack to
still get select wake-all right (the reason they were mixed throughout 2.3.x was
the same reason we _need_ to somehow mix non-exclusive and exclusive waiters in
the same waitqueue in 2.2.x to provide wake-one in accept).

And in 2.2.x the "who does that anyways" is much stronger, since _only_
accept is using the exclusive wakeup.

> I think it would be good to have the same semantics in 2.2 as
> we have in 2.4. [..]

2.3.0 was born to allow O(1) insertion in the waitqueue, because that change
alone generated a huge amount of details that were not possible to handle in
2.2.x.

> [..] If there is a good reason to go with the
> semantics Andrea proposed [..]

NOTE: I'm only talking about 2.2.19pre2, not 2.4.x. I didn't suggest anything
for 2.4.x and I'm totally fine with two different waitqueues. I even wanted to
differentiate them in mid 2.3.x, to fix accept being FIFO while still
allowing insertion in the waitqueue in O(1), but I didn't have the time to
rework the TCP stack and the rest of the wake-one users.

Andrea



Re: Linux 2.2.19pre2

2000-12-20 Thread Rik van Riel

On Wed, 20 Dec 2000, Andrea Arcangeli wrote:
> On Thu, Dec 21, 2000 at 01:57:15AM +1100, Andrew Morton wrote:
> > If a task is on two waitqueues at the same time it becomes a bug:
> > if the outer waitqueue is non-exclusive and the inner is exclusive,
> 
> Your 2.2.x won't allow that either. You set
> `current->task_exclusive = 1' and so you will get exclusive
> wakeups in both waitqueues. You simply cannot register in
> two waitqueues and expect a non-exclusive wakeup in one and an
> exclusive wakeup in the other one.

Why not?  Having a wake-all wakeup on one event and a
wake-one wakeup on another kind of event seems perfectly
reasonable to me at a first glance. Is there something
I've overlooked ?

> > Anyway, it's academic.  davem would prefer that we do it properly
> > and move the `exclusive' flag into the waitqueue head to avoid the 
> > task-on-two-waitqueues problem, as was done in 2.4.  I think he's
> 
> The fact you could mix non-exclusive and exclusive wakeups in the
> same waitqueue was a feature, not a misfeature. Then of course
> you cannot register in two waitqueues, one with wake-one and one
> with wake-all, but who does that anyway?

The "who does that anyways" argument could also be said about
mixing exclusive and non-exclusive wakeups in the same waitqueue.
Doing this seems rather confusing ... would you know any application
which needs this ?

> I think the real reason for separating the two things as davem
> proposed is because otherwise we cannot register for a LIFO
> wake-one in O(1) as we needed for accept.

Do you have any reason to assume Davem and Linus lied about
why they changed the semantics? ;)  I'm pretty sure the reason
why Linus and Davem changed the semantics is the same reason
they mailed to this list ... ;)


I think it would be good to have the same semantics in 2.2 as
we have in 2.4. Using different semantics will probably only
lead to confusion. If there is a good reason to go with the
semantics Andrea proposed over the semantics Linus and Davem
have in 2.4, we should probably either change 2.4 or use the
2.4 semantics in 2.2 anyway just to avoid the confusion...

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: Linux 2.2.19pre2

2000-12-20 Thread Andrea Arcangeli

On Thu, Dec 21, 2000 at 01:57:15AM +1100, Andrew Morton wrote:
> If a task is on two waitqueues at the same time it becomes a bug:
> if the outer waitqueue is non-exclusive and the inner is exclusive,

Your 2.2.x won't allow that either. You set `current->task_exclusive = 1'
and so you will get exclusive wakeups in both waitqueues. You simply cannot
register in two waitqueues and expect a non-exclusive wakeup in one and an
exclusive wakeup in the other one.

The only design difference (not implementation difference) between my patch and
your patch is that you have to clear task_exclusive explicitly. You
moved the EXCLUSIVE bitflag out of the current->state field. That gives no
advantage and it looks uglier to me. The robustness point doesn't hold
IMHO: as soon as current->state has been changed by somebody else,
you don't care anymore whether it was an exclusive wakeup or not.

As for the fact that I mask out the exclusive bit in schedule, that's zero cost
compared to a cacheline miss, though it also depends on whether you have more
wakeups or schedules (with accept(2) usage there are going to be more schedules
than wakeups, but in other usages that might not be the case); but OK, the
performance point was nearly irrelevant ;).

> Anyway, it's academic.  davem would prefer that we do it properly
> and move the `exclusive' flag into the waitqueue head to avoid the 
> task-on-two-waitqueues problem, as was done in 2.4.  I think he's

The fact you could mix non-exclusive and exclusive wakeups in the same waitqueue
was a feature, not a misfeature. Then of course you cannot register in two
waitqueues, one with wake-one and one with wake-all, but who does that anyway?
Definitely not an issue for 2.2.x.

I think the real reason for separating the two things as davem proposed is
because otherwise we cannot register for a LIFO wake-one in O(1) as we needed
for accept.

One other thing about your patch: adding TASK_EXCLUSIVE to
wake_up/wake_up_interruptible is useless. You kind of mixed the two things at
the source level. In your patch TASK_EXCLUSIVE should not be defined.  Last
thing: the wmb() in accept wasn't necessary. At that point you don't care at all
what the wakeup can see.

Andrea



Re: Linux 2.2.19pre2

2000-12-20 Thread Andrew Morton

Andrea Arcangeli wrote:
> 
> > o wake_one semantics for accept() (Andrew Morton)
> 
> I dislike the implementation. I stick with my faster and nicer implementation
> that was just included in aa kernels for some time (2.2.18aa2 included it for
> example). Andrew, I guess you didn't noticed I just implemented the wakeone for
> accept. (I just ported it on top of 2.2.19pre2 after backing out the wakeone in
> pre2)

Yes, I noticed your implementation a few weeks back.

'fraid I never liked the use of the TASK_EXCLUSIVE bit in
task_struct.state.  It's an enumerated state, not an aggregation
of flags.  Most of the kernel treats it as an enumerated state
and so will happily clear the TASK_EXCLUSIVE bit without masking it
off.  Fragile.

If a task is on two waitqueues at the same time it becomes a bug:
if the outer waitqueue is non-exclusive and the inner is exclusive,
the task suddenly becomes exclusive on the outer one and converts
it from wake-all to wake-some!

Is it faster?  Not sure.  You've saved a cacheline read in __wake_up
(which was in fact a preload, if you look at what comes later) at the
cost of having to mask out the bit in current->state every time
we schedule().

Anyway, it's academic.  davem would prefer that we do it properly
and move the `exclusive' flag into the waitqueue head to avoid the 
task-on-two-waitqueues problem, as was done in 2.4.  I think he's
right.  Do you?




Re: Linux 2.2.19pre2

2000-12-20 Thread Andrea Arcangeli

On Sat, Dec 16, 2000 at 07:11:47PM +, Alan Cox wrote:
> o E820 memory detect backport from 2.4(Michael Chen)

It's broken, it will crash machines:

   for (i = 0; i < e820.nr_map; i++) {
   unsigned long start, end;
   /* RAM? */
   if (e820.map[i].type != E820_RAM)
   continue;
   start = PFN_UP(e820.map[i].addr);
   end = PFN_DOWN(e820.map[i].addr + e820.map[i].size);
   if (start >= end)
   continue;
   if (end > max_pfn)
   max_pfn = end;
   }
   memory_end = (max_pfn << PAGE_SHIFT);

this will treat non-RAM holes as RAM. On 2.4.x we do things differently: we
collect max_pfn, but then we don't assume that there are no holes
between 1M and max_pfn ;); instead we fill the bootmem allocator _only_ with
E820_RAM segments.

I was in the process of fixing this (I also just backported the thinkpad
%edx clobber fix), but if somebody is going to work on this please let
me know so we stay in sync.

> o wake_one semantics for accept() (Andrew Morton)

I dislike the implementation. I stick with my faster and nicer implementation
that has been included in aa kernels for some time (2.2.18aa2 included it, for
example). Andrew, I guess you didn't notice that I had already implemented
wake-one for accept. (I just ported it on top of 2.2.19pre2 after backing out
the wake-one in pre2.)

Andrea



Re: Linux 2.2.19pre2

2000-12-20 Thread Kurt Garloff

On Wed, Dec 20, 2000 at 12:32:54PM +0200, Petri Kaukasoina wrote:
> OK, I booted 2.4.0-test12 which even prints that list:
> 
> BIOS-provided physical RAM map:
>  BIOS-e820: 0009fc00 @  (usable)
>  BIOS-e820: 0400 @ 0009fc00 (reserved)
>  BIOS-e820: 0348 @ 0010 (usable)
>  BIOS-e820: 0010 @ fec0 (reserved)
>  BIOS-e820: 0010 @ fee0 (reserved)
>  BIOS-e820: 0001 @  (reserved)
> Memory: 52232k/54784k available (831k kernel code, 2164k reserved, 62k data, 168k init, 0k highmem)
> 
> The last three reserved lines correspond to the missing 2.5 Megs. What are
> they?

Data reserved by your system for whatever purpose.
Most probably ACPI data or similar.

> 2.2.18 sees all 56 Megs and works ok and after adding mem=56M on the
> kernel command line even 2.2.19pre2 works ok with all the 56 Megs. No
> crashes.

If you had ACPI events, you would probably run into trouble.
Apart from this, chances are that the reserved data is not needed by Linux
and never accessed by the BIOS, so you may get away with using the reserved
memory.
The safe way is to respect the BIOS' RAM map.

Regards,
-- 
Kurt Garloff  <[EMAIL PROTECTED]>  Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE GmbH, Nuernberg, FRG   SCSI, Security

 PGP signature


Re: Linux 2.2.19pre2

2000-12-20 Thread Petri Kaukasoina

On Sun, Dec 17, 2000 at 04:38:02PM +0100, Kurt Garloff wrote:
> On Sun, Dec 17, 2000 at 12:56:56PM +0200, Petri Kaukasoina wrote:
> > I guess the new memory detect does not work correctly with my old work
> > horse. It is a 100 MHz pentium with 56 Megs RAM. AMIBIOS dated 10/10/94 with
> > a version number of 51-000-0001169_0011-101094-SIS550X-H.
> > 
> > 2.2.18 reports:
> > Memory: 55536k/57344k available (624k kernel code, 412k reserved, 732k data, 40k init)
> > 
> > 2.2.19pre2 reports:
> > Memory: 53000k/54784k available (628k kernel code, 408k reserved, 708k data, 40k init)
> > 
> > 57344k is 56 Megs which is correct.
> > 54784k is only 53.5 Megs.
> 
> It's this patch that changes things for you:
> o   E820 memory detect backport from 2.4(Michael Chen)
> 
> The E820 memory detection parses a list from the BIOS, which specifies
> the amount of memory, holes, reserved regions, ...
> Apparently, your BIOS does not do it completely correctly; otherwise you
> should have had crashes before ...

OK, I booted 2.4.0-test12 which even prints that list:

BIOS-provided physical RAM map:
 BIOS-e820: 0009fc00 @  (usable)
 BIOS-e820: 0400 @ 0009fc00 (reserved)
 BIOS-e820: 0348 @ 0010 (usable)
 BIOS-e820: 0010 @ fec0 (reserved)
 BIOS-e820: 0010 @ fee0 (reserved)
 BIOS-e820: 0001 @  (reserved)
Memory: 52232k/54784k available (831k kernel code, 2164k reserved, 62k data, 168k init, 0k highmem)

The last three reserved lines correspond to the missing 2.5 Megs. What are
they? 2.2.18 sees all 56 Megs and works ok and after adding mem=56M on the
kernel command line even 2.2.19pre2 works ok with all the 56 Megs. No
crashes.



Re: Linux 2.2.19pre2

2000-12-20 Thread Petri Kaukasoina

On Sun, Dec 17, 2000 at 04:38:02PM +0100, Kurt Garloff wrote:
 On Sun, Dec 17, 2000 at 12:56:56PM +0200, Petri Kaukasoina wrote:
  I guess the new memory detect does not work correctly with my old work
  horse. It is a 100 MHz pentium with 56 Megs RAM. AMIBIOS dated 10/10/94 with
  a version number of 51-000-0001169_0011-101094-SIS550X-H.
  
  2.2.18 reports:
  Memory: 55536k/57344k available (624k kernel code, 412k reserved, 732k data, 40k 
init)
  
  2.2.19pre2 reports:
  Memory: 53000k/54784k available (628k kernel code, 408k reserved, 708k data, 40k 
init)
  
  57344k is 56 Megs which is correct.
  54784k is only 53.5 Megs.
 
 It's this patch that changes things for you:
 o   E820 memory detect backport from 2.4(Michael Chen)
 
 The E820 memory detection parses a list from the BIOS, which specifies
 the amount of memory, holes, reserved regions, ...
 Apparently, your BIOS does not do it completely correctly; otherwise you
 should have had crashes before ...

OK, I booted 2.4.0-test12 which even prints that list:

BIOS-provided physical RAM map:
 BIOS-e820: 0009fc00 @  (usable)
 BIOS-e820: 0400 @ 0009fc00 (reserved)
 BIOS-e820: 0348 @ 0010 (usable)
 BIOS-e820: 0010 @ fec0 (reserved)
 BIOS-e820: 0010 @ fee0 (reserved)
 BIOS-e820: 0001 @  (reserved)
Memory: 52232k/54784k available (831k kernel code, 2164k reserved, 62k data, 168k 
init, 0k highmem)

The last three reserved lines correspond to the missing 2.5 Megs. What are
they? 2.2.18 sees all 56 Megs and works ok and after adding mem=56M on the
kernel command line even 2.2.19pre2 works ok with all the 56 Megs. No
crashes.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Linux 2.2.19pre2

2000-12-20 Thread Kurt Garloff

On Wed, Dec 20, 2000 at 12:32:54PM +0200, Petri Kaukasoina wrote:
 OK, I booted 2.4.0-test12 which even prints that list:
 
 BIOS-provided physical RAM map:
  BIOS-e820: 0009fc00 @  (usable)
  BIOS-e820: 0400 @ 0009fc00 (reserved)
  BIOS-e820: 0348 @ 0010 (usable)
  BIOS-e820: 0010 @ fec0 (reserved)
  BIOS-e820: 0010 @ fee0 (reserved)
  BIOS-e820: 0001 @  (reserved)
 Memory: 52232k/54784k available (831k kernel code, 2164k reserved, 62k data, 168k 
init, 0k highmem)
 
 The last three reserved lines correspond to the missing 2.5 Megs. What are
 they?

Data reserved by your system for whatever purpose.
Most probably ACPI data or similar.

 2.2.18 sees all 56 Megs and works ok and after adding mem=56M on the
 kernel command line even 2.2.19pre2 works ok with all the 56 Megs. No
 crashes.

If you would have ACPI events, you would probably run into trouble.
Apart from this, chances are that the reserved data is not needed by Linux
and never accessed by the BIOS, so you may get away with using the reserved
memory.
The safe way is to respect the BIOS' RAM map.

Regards,
-- 
Kurt Garloff  [EMAIL PROTECTED]  Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE GmbH, Nuernberg, FRG   SCSI, Security

 PGP signature


Re: Linux 2.2.19pre2

2000-12-20 Thread Andrea Arcangeli

On Sat, Dec 16, 2000 at 07:11:47PM +, Alan Cox wrote:
 o E820 memory detect backport from 2.4(Michael Chen)

It's broken, it will crash machines:

   for (i = 0; i  e820.nr_map; i++) {
   unsigned long start, end;
   /* RAM? */
   if (e820.map[i].type != E820_RAM)
   continue;
   start = PFN_UP(e820.map[i].addr);
   end = PFN_DOWN(e820.map[i].addr + e820.map[i].size);
   if (start = end)
   continue;
   if (end  max_pfn)
   max_pfn = end;
   }
   memory_end = (max_pfn  PAGE_SHIFT);

this will threat non-RAM holes as RAM. On 2.4.x we do a different things, that
is we collect the max_pfn but then we don't assume that there are no holes
between 1M and max_pfn ;), we instead fill the bootmem allocator _only_ with
E820_RAM segments.

I was in the process of fixing this (I also just backported the thinkpad
%edx clobber fix), but if somebody is going to work on this please let
me know so we stay in sync.

 o wake_one semantics for accept() (Andrew Morton)

I dislike the implementation. I stick with my faster and nicer implementation
that was just included in aa kernels for some time (2.2.18aa2 included it for
example). Andrew, I guess you didn't noticed I just implemented the wakeone for
accept. (I just ported it on top of 2.2.19pre2 after backing out the wakeone in
pre2)

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Linux 2.2.19pre2

2000-12-20 Thread Andrew Morton

Andrea Arcangeli wrote:
 
  o wake_one semantics for accept() (Andrew Morton)
 
 I dislike the implementation. I stick with my faster and nicer implementation
 that was just included in aa kernels for some time (2.2.18aa2 included it for
 example). Andrew, I guess you didn't noticed I just implemented the wakeone for
 accept. (I just ported it on top of 2.2.19pre2 after backing out the wakeone in
 pre2)

Yes, I noticed your implementation a few weeks back.

'fraid I never liked the use of the TASK_EXCLUSIVE bit in
task_struct.state.  It's an enumerated state, not an aggregation
of flags.  Most of the kernel treats it as an enumerated state
and so will happily clear the TASK_EXCLUSIVE bit without masking it
off.  Fragile.

If a task is on two waitqueues at the same time it becomes a bug:
if the outer waitqueue is non-exclusive and the inner is exclusive,
the task suddenly becomes exclusive on the outer one and converts
it from wake-all to wake-some!

Is it faster?  Not sure.  You've saved a cacheline read in __wake_up
(which was in fact a preload, if you look at what comes later) at the
cost of having to mask out the bit in current-state every time
we schedule().

Anyway, it's academic.  davem would prefer that we do it properly
and move the `exclusive' flag into the waitqueue head to avoid the 
task-on-two-waitqueues problem, as was done in 2.4.  I think he's
right.  Do you?

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Linux 2.2.19pre2

2000-12-20 Thread Andrea Arcangeli

On Thu, Dec 21, 2000 at 01:57:15AM +1100, Andrew Morton wrote:
 If a task is on two waitqueues at the same time it becomes a bug:
 if the outer waitqueue is non-exclusive and the inner is exclusive,

Your 2.2.x won't allow that either. You set the `current-task_exclusive = 1'
and so you will get an exclusive wakeups in both waitqueues. You simply cannot
tell register in two waitqueue and expect a non-exlusive wakeup in one and an
exclusive wakeup in the other one.

The only design difference (non implementation difference) between my patch and
your patch is that you have to clear task_exlusive explicitly. You
moved the EXCLUSIVE bitflag out of current-state field. That gives no
advantages and it looks ugiler to me. The robusteness point doesn't hold
IMHO: as soon as current-state is been changed by somebody else
you don't care anymore if it was exclusive wakeup or not.

About the fact I mask out the exlusive bit in schedule that's zero cost
compared to a cacheline miss, but it also depends if you have more wakeups or
schedules (with accept(2) usage there are going to be more schedule than
wakeups, but on other usages that could not be the case) but ok, the
performance point was nearly irrelevant ;).

 Anyway, it's academic.  davem would prefer that we do it properly
 and move the `exclusive' flag into the waitqueue head to avoid the 
 task-on-two-waitqueues problem, as was done in 2.4.  I think he's

The fact you could mix non-exclusive and exlusive wakeups in the same waitqueue
was a feature not a misfeature. Then of course you cannot register in two
waitqueues one with wake-one and one with wake-all but who does that anyways?
Definitely not an issue for 2.2.x.

I think the real reason for spearating the two things as davem proposed is
because otherwise we cannot register for a LIFO wake-one in O(1) as we needed
for accept.

Other thing about your patch, adding TASK_EXCLUSIVE to
wake_up/wake_up_interruptible is useless. You kind of mixed the two things at
the source level. In your patch TASK_EXCLUSIVE should not be defined.  Last
thing the wmb() in accept wasn't necessary. At that point you don't care at all
what the wakeup can see.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Linux 2.2.19pre2

2000-12-20 Thread Rik van Riel

On Wed, 20 Dec 2000, Andrea Arcangeli wrote:
> On Thu, Dec 21, 2000 at 01:57:15AM +1100, Andrew Morton wrote:
> > If a task is on two waitqueues at the same time it becomes a bug:
> > if the outer waitqueue is non-exclusive and the inner is exclusive,
>
> Your 2.2.x won't allow that either. You set
> `current->task_exclusive = 1' and so you will get an exclusive
> wakeup in both waitqueues. You simply cannot register in
> two waitqueues and expect a non-exclusive wakeup in one and an
> exclusive wakeup in the other one.

Why not?  Having a wake-all wakeup on one event and a
wake-one wakeup on another kind of event seems perfectly
reasonable to me at first glance. Is there something
I've overlooked?

> > Anyway, it's academic.  davem would prefer that we do it properly
> > and move the `exclusive' flag into the waitqueue head to avoid the
> > task-on-two-waitqueues problem, as was done in 2.4.  I think he's
>
> The fact that you could mix non-exclusive and exclusive wakeups in the
> same waitqueue was a feature, not a misfeature. Then of course
> you cannot register in two waitqueues, one with wake-one and one
> with wake-all, but who does that anyway?

The "who does that anyways" argument could also be said about
mixing exclusive and non-exclusive wakeups in the same waitqueue.
Doing this seems rather confusing ... would you know any application
which needs this ?

> I think the real reason for separating the two things as davem
> proposed is because otherwise we cannot register for a LIFO
> wake-one in O(1) as we needed for accept.

Do you have any reason to assume Davem and Linus lied about
why they changed the semantics? ;)  I'm pretty sure the reason
why Linus and Davem changed the semantics is the same reason
they mailed to this list ... ;)


I think it would be good to have the same semantics in 2.2 as
we have in 2.4. Using different semantics will probably only
lead to confusion. If there is a good reason to go with the
semantics Andrea proposed over the semantics Linus and Davem
have in 2.4, we should probably either change 2.4 or use the
2.4 semantics in 2.2 anyway just to avoid the confusion...

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: Linux 2.2.19pre2

2000-12-20 Thread Andrea Arcangeli

On Wed, Dec 20, 2000 at 03:48:06PM -0200, Rik van Riel wrote:
> On Wed, 20 Dec 2000, Andrea Arcangeli wrote:
> > On Thu, Dec 21, 2000 at 01:57:15AM +1100, Andrew Morton wrote:
> > > If a task is on two waitqueues at the same time it becomes a bug:
> > > if the outer waitqueue is non-exclusive and the inner is exclusive,
> >
> > Your 2.2.x won't allow that either. You set
> > `current->task_exclusive = 1' and so you will get an exclusive
> > wakeup in both waitqueues. You simply cannot register in
> > two waitqueues and expect a non-exclusive wakeup in one and an
> > exclusive wakeup in the other one.
>
> Why not?  Having a wake-all wakeup on one event and a
> wake-one wakeup on another kind of event seems perfectly
> reasonable to me at first glance. Is there something
> I've overlooked?

I think you did overlook something; never mind. I only said kernel 2.2.19pre2
won't allow that either. I'm not saying it's impossible to implement or
unreasonable.

> > > Anyway, it's academic.  davem would prefer that we do it properly
> > > and move the `exclusive' flag into the waitqueue head to avoid the
> > > task-on-two-waitqueues problem, as was done in 2.4.  I think he's
> >
> > The fact that you could mix non-exclusive and exclusive wakeups in the
> > same waitqueue was a feature, not a misfeature. Then of course
> > you cannot register in two waitqueues, one with wake-one and one
> > with wake-all, but who does that anyway?
>
> The "who does that anyways" argument could also be made about
> mixing exclusive and non-exclusive wakeups in the same waitqueue.
> Doing this seems rather confusing ... do you know of any application
> which needs this?

Wake-one accept(2) in 2.2.x, unless you want to rewrite part of the TCP stack
to still get select's wake-all right (the reason they were mixed throughout
2.3.x is the same reason we _need_ to somehow mix non-exclusive and exclusive
waiters in the same waitqueue in 2.2.x to provide wake-one in accept).

And in 2.2.x the "who does that anyways" argument is much stronger, since
_only_ accept is using the exclusive wakeup.

> I think it would be good to have the same semantics in 2.2 as
> we have in 2.4. [..]

2.3.0 was born to allow O(1) insertion in the waitqueue, because that change
alone generated a huge amount of details that were not possible to handle in
2.2.x.

> [..] If there is a good reason to go with the
> semantics Andrea proposed [..]

NOTE: I'm only talking about 2.2.19pre2, not 2.4.x. I didn't suggest anything
for 2.4.x and I'm totally fine with the two different waitqueues. I even
wanted to differentiate them in mid 2.3.x to fix accept, which was FIFO, while
still allowing insertion in the waitqueue in O(1), but I didn't have the time
to rework the TCP stack and the rest of the wake-one users.

Andrea



Re: Linux 2.2.19pre2

2000-12-17 Thread Kurt Garloff

Hi,

On Sun, Dec 17, 2000 at 12:56:56PM +0200, Petri Kaukasoina wrote:
> I guess the new memory detect does not work correctly with my old work
> horse. It is a 100 MHz pentium with 56 Megs RAM. AMIBIOS dated 10/10/94 with
> a version number of 51-000-0001169_0011-101094-SIS550X-H.
> 
> 2.2.18 reports:
> Memory: 55536k/57344k available (624k kernel code, 412k reserved, 732k data, 40k init)
> 
> 2.2.19pre2 reports:
> Memory: 53000k/54784k available (628k kernel code, 408k reserved, 708k data, 40k init)
> 
> 57344k is 56 Megs which is correct.
> 54784k is only 53.5 Megs.

It's this patch that changes things for you:
o   E820 memory detect backport from 2.4 (Michael Chen)

The E820 memory detection parses a list from the BIOS, which specifies
the amount of memory, holes, reserved regions, ...
Apparently, your BIOS does not do this completely correctly; otherwise you
would have had crashes before ...

Regards,
-- 
Kurt Garloff  <[EMAIL PROTECTED]>  Eindhoven, NL
GPG key: See mail header, key servers Linux kernel development
SuSE GmbH, Nuernberg, FRG   SCSI, Security

 PGP signature


Re: Linux 2.2.19pre2

2000-12-17 Thread Petri Kaukasoina

I guess the new memory detect does not work correctly with my old work
horse. It is a 100 MHz pentium with 56 Megs RAM. AMIBIOS dated 10/10/94 with
a version number of 51-000-0001169_0011-101094-SIS550X-H.

2.2.18 reports:
Memory: 55536k/57344k available (624k kernel code, 412k reserved, 732k data, 40k init)

2.2.19pre2 reports:
Memory: 53000k/54784k available (628k kernel code, 408k reserved, 708k data, 40k init)

57344k is 56 Megs which is correct.
54784k is only 53.5 Megs.


