Re: proposed smp_rendezvous change

2011-07-18 Thread Andriy Gapon
on 15/05/2011 19:24 Andriy Gapon said the following: on 15/05/2011 19:09 Max Laier said the following: I don't think we ever intended to synchronize the local teardown part, and I believe that is the correct behavior for this API. This version is sufficiently close to what I have, so I am

Re: proposed smp_rendezvous change

2011-07-18 Thread Andriy Gapon
on 17/05/2011 18:51 John Baldwin said the following: On Tuesday, May 17, 2011 10:34:41 am Andriy Gapon wrote: on 17/05/2011 16:58 John Baldwin said the following: No, it doesn't quite work that way. It wouldn't work on Alpha for example. All load_acq is a load with a memory barrier to order

Re: proposed smp_rendezvous change

2011-07-18 Thread John Baldwin
On Monday, July 18, 2011 7:34:11 am Andriy Gapon wrote: on 15/05/2011 19:24 Andriy Gapon said the following: on 15/05/2011 19:09 Max Laier said the following: I don't think we ever intended to synchronize the local teardown part, and I believe that is the correct behavior for this

Re: proposed smp_rendezvous change

2011-05-19 Thread John Baldwin
On Wednesday, May 18, 2011 4:27:50 pm Max Laier wrote: On 05/17/2011 01:35 PM, John Baldwin wrote: ... Yeah, I already have a patch to do that, but hadn't added atomic ops to critical_enter() and critical_exit(). But it also wasn't as fancy in the critical_exit() case. Here is what I

Re: proposed smp_rendezvous change

2011-05-19 Thread John Baldwin
On Wednesday, May 18, 2011 4:27:50 pm Max Laier wrote: On 05/17/2011 01:35 PM, John Baldwin wrote: ... Yeah, I already have a patch to do that, but hadn't added atomic ops to critical_enter() and critical_exit(). But it also wasn't as fancy in the critical_exit() case. Here is what I

Re: proposed smp_rendezvous change

2011-05-18 Thread Max Laier
On 05/17/2011 01:35 PM, John Baldwin wrote: ... Yeah, I already have a patch to do that, but hadn't added atomic ops to critical_enter() and critical_exit(). But it also wasn't as fancy in the critical_exit() case. Here is what I have and I think it might actually be ok (it doesn't use an

Re: proposed smp_rendezvous change

2011-05-17 Thread Andriy Gapon
on 16/05/2011 23:09 John Baldwin said the following: On Monday, May 16, 2011 3:27:47 pm Andriy Gapon wrote: on 16/05/2011 21:21 John Baldwin said the following: How about this: ... /* * Shared mutex to restrict busywaits between smp_rendezvous() and @@ -311,39 +312,62 @@

Re: proposed smp_rendezvous change

2011-05-17 Thread John Baldwin
On 5/17/11 4:03 AM, Andriy Gapon wrote: on 16/05/2011 23:09 John Baldwin said the following: is probably just cut and pasted to match the other uses of values in the smp_rv_waiters[] array. (atomic_add_acq_int() could spin on architectures where it is implemented using compare-and-swap (e.g.

Re: proposed smp_rendezvous change

2011-05-17 Thread John Baldwin
On 5/16/11 6:05 PM, Max Laier wrote: On Monday 16 May 2011 17:54:54 Attilio Rao wrote: 2011/5/16 Max Laierm...@love2party.net: On Monday 16 May 2011 16:46:03 John Baldwin wrote: On Monday, May 16, 2011 4:30:44 pm Max Laier wrote: On Monday 16 May 2011 14:21:27 John Baldwin wrote: Yes, we

Re: proposed smp_rendezvous change

2011-05-17 Thread Andriy Gapon
on 17/05/2011 14:56 John Baldwin said the following: On 5/17/11 4:03 AM, Andriy Gapon wrote: Couldn't [Shouldn't] the whole: /* Ensure we have up-to-date values. */ atomic_add_acq_int(smp_rv_waiters[0], 1); while (smp_rv_waiters[0] smp_rv_ncpus) cpu_spinwait();

Re: proposed smp_rendezvous change

2011-05-17 Thread John Baldwin
On Tuesday, May 17, 2011 9:46:34 am Andriy Gapon wrote: on 17/05/2011 14:56 John Baldwin said the following: On 5/17/11 4:03 AM, Andriy Gapon wrote: Couldn't [Shouldn't] the whole: /* Ensure we have up-to-date values. */ atomic_add_acq_int(smp_rv_waiters[0], 1);

Re: proposed smp_rendezvous change

2011-05-17 Thread Andriy Gapon
on 17/05/2011 16:58 John Baldwin said the following: No, it doesn't quite work that way. It wouldn't work on Alpha for example. All load_acq is a load with a memory barrier to order other loads after it. It is still free to load stale data. Only a read-modify-write operation would actually

Re: proposed smp_rendezvous change

2011-05-17 Thread Max Laier
On 05/17/2011 05:16 AM, John Baldwin wrote: ... Index: kern/kern_switch.c === --- kern/kern_switch.c (revision 221536) +++ kern/kern_switch.c (working copy) @@ -192,15 +192,22 @@ critical_exit(void) { struct thread *td; - int flags;

Re: proposed smp_rendezvous change

2011-05-17 Thread John Baldwin
On Tuesday, May 17, 2011 10:34:41 am Andriy Gapon wrote: on 17/05/2011 16:58 John Baldwin said the following: No, it doesn't quite work that way. It wouldn't work on Alpha for example. All load_acq is a load with a memory barrier to order other loads after it. It is still free to load

Re: proposed smp_rendezvous change

2011-05-17 Thread John Baldwin
On Tuesday, May 17, 2011 12:20:40 pm Max Laier wrote: On 05/17/2011 05:16 AM, John Baldwin wrote: ... Index: kern/kern_switch.c === --- kern/kern_switch.c (revision 221536) +++ kern/kern_switch.c (working copy) @@ -192,15

Re: proposed smp_rendezvous change

2011-05-17 Thread Max Laier
On 05/17/2011 09:56 AM, John Baldwin wrote: On Tuesday, May 17, 2011 12:20:40 pm Max Laier wrote: On 05/17/2011 05:16 AM, John Baldwin wrote: ... Index: kern/kern_switch.c === --- kern/kern_switch.c (revision 221536) +++

Re: proposed smp_rendezvous change

2011-05-17 Thread Andriy Gapon
on 17/05/2011 18:51 John Baldwin said the following: On Tuesday, May 17, 2011 10:34:41 am Andriy Gapon wrote: on 17/05/2011 16:58 John Baldwin said the following: No, it doesn't quite work that way. It wouldn't work on Alpha for example. All load_acq is a load with a memory barrier to order

Re: proposed smp_rendezvous change

2011-05-17 Thread John Baldwin
On Tuesday, May 17, 2011 3:16:45 pm Max Laier wrote: On 05/17/2011 09:56 AM, John Baldwin wrote: On Tuesday, May 17, 2011 12:20:40 pm Max Laier wrote: On 05/17/2011 05:16 AM, John Baldwin wrote: ... Index: kern/kern_switch.c

Re: proposed smp_rendezvous change

2011-05-16 Thread John Baldwin
On Sunday, May 15, 2011 12:36:45 pm Andriy Gapon wrote: on 15/05/2011 18:16 John Baldwin said the following: On 5/15/11 10:53 AM, Andriy Gapon wrote: on 15/05/2011 10:12 Andriy Gapon said the following: on 14/05/2011 18:25 John Baldwin said the following: Hmmm, so this is not actually

Re: proposed smp_rendezvous change

2011-05-16 Thread Max Laier
I'd like to propose that we move forward with fixing the race, before redesigning the API - please. We agree that there is a problem with the exit rendezvous. Let's fix that. We agree that the generation approach proposed by NetAPP is the right way to go? Can we check it in, then? Again,

Re: proposed smp_rendezvous change

2011-05-16 Thread John Baldwin
On Monday, May 16, 2011 1:46:33 pm Max Laier wrote: I'd like to propose that we move forward with fixing the race, before redesigning the API - please. We agree that there is a problem with the exit rendezvous. Let's fix that. We agree that the generation approach proposed by NetAPP is

Re: proposed smp_rendezvous change

2011-05-16 Thread Andriy Gapon
on 16/05/2011 21:21 John Baldwin said the following: How about this: ... /* * Shared mutex to restrict busywaits between smp_rendezvous() and @@ -311,39 +312,62 @@ restart_cpus(cpumask_t map) void smp_rendezvous_action(void) { - void* local_func_arg = smp_rv_func_arg; -

Re: proposed smp_rendezvous change

2011-05-16 Thread Max Laier
On Monday 16 May 2011 14:21:27 John Baldwin wrote: On Monday, May 16, 2011 1:46:33 pm Max Laier wrote: I'd like to propose that we move forward with fixing the race, before redesigning the API - please. We agree that there is a problem with the exit rendezvous. Let's fix that. We

Re: proposed smp_rendezvous change

2011-05-16 Thread John Baldwin
On Monday, May 16, 2011 3:27:47 pm Andriy Gapon wrote: on 16/05/2011 21:21 John Baldwin said the following: How about this: ... /* * Shared mutex to restrict busywaits between smp_rendezvous() and @@ -311,39 +312,62 @@ restart_cpus(cpumask_t map) void

Re: proposed smp_rendezvous change

2011-05-16 Thread John Baldwin
On Monday, May 16, 2011 4:30:44 pm Max Laier wrote: On Monday 16 May 2011 14:21:27 John Baldwin wrote: Yes, we need to fix that. Humm, it doesn't preempt when you do a critical_exit() though? Or do you use a hand-rolled critical exit that doesn't do a deferred preemption? Right now I

Re: proposed smp_rendezvous change

2011-05-16 Thread Max Laier
On Monday 16 May 2011 16:46:03 John Baldwin wrote: On Monday, May 16, 2011 4:30:44 pm Max Laier wrote: On Monday 16 May 2011 14:21:27 John Baldwin wrote: Yes, we need to fix that. Humm, it doesn't preempt when you do a critical_exit() though? Or do you use a hand-rolled critical exit

Re: proposed smp_rendezvous change

2011-05-16 Thread Max Laier
On Monday 16 May 2011 17:54:54 Attilio Rao wrote: 2011/5/16 Max Laier m...@love2party.net: On Monday 16 May 2011 16:46:03 John Baldwin wrote: On Monday, May 16, 2011 4:30:44 pm Max Laier wrote: On Monday 16 May 2011 14:21:27 John Baldwin wrote: Yes, we need to fix that. Humm, it

Re: proposed smp_rendezvous change

2011-05-16 Thread Attilio Rao
2011/5/17 Max Laier m...@love2party.net: On Monday 16 May 2011 17:54:54 Attilio Rao wrote: 2011/5/16 Max Laier m...@love2party.net: On Monday 16 May 2011 16:46:03 John Baldwin wrote: On Monday, May 16, 2011 4:30:44 pm Max Laier wrote: On Monday 16 May 2011 14:21:27 John Baldwin wrote:

Re: proposed smp_rendezvous change

2011-05-16 Thread Attilio Rao
2011/5/16 Max Laier m...@love2party.net: On Monday 16 May 2011 16:46:03 John Baldwin wrote: On Monday, May 16, 2011 4:30:44 pm Max Laier wrote: On Monday 16 May 2011 14:21:27 John Baldwin wrote: Yes, we need to fix that.  Humm, it doesn't preempt when you do a critical_exit() though?  Or

Re: proposed smp_rendezvous change

2011-05-15 Thread Andriy Gapon
on 14/05/2011 18:25 John Baldwin said the following: On 5/13/11 9:43 AM, Andriy Gapon wrote: This is a change in vein of what I've been doing in the xcpu branch and it's supposed to fix the issue by the recent commit that (probably unintentionally) stress-tests smp_rendezvous in TSC code.

Re: proposed smp_rendezvous change

2011-05-15 Thread Andriy Gapon
on 14/05/2011 18:25 John Baldwin said the following: Hmmm, so this is not actually sufficient. NetApp ran into a very similar race with virtual CPUs in BHyVe. In their case because virtual CPUs are threads that can be preempted, they have a chance at a longer race. The problem that they

Re: proposed smp_rendezvous change

2011-05-15 Thread Andriy Gapon
on 15/05/2011 07:33 Max Laier said the following: On Saturday 14 May 2011 11:25:36 John Baldwin wrote: On 5/13/11 9:43 AM, Andriy Gapon wrote: This is a change in vein of what I've been doing in the xcpu branch and it's supposed to fix the issue by the recent commit that (probably

Re: proposed smp_rendezvous change

2011-05-15 Thread Andriy Gapon
on 15/05/2011 10:12 Andriy Gapon said the following: on 14/05/2011 18:25 John Baldwin said the following: Hmmm, so this is not actually sufficient. NetApp ran into a very similar race with virtual CPUs in BHyVe. In their case because virtual CPUs are threads that can be preempted, they

Re: proposed smp_rendezvous change

2011-05-15 Thread John Baldwin
On 5/15/11 10:53 AM, Andriy Gapon wrote: on 15/05/2011 10:12 Andriy Gapon said the following: on 14/05/2011 18:25 John Baldwin said the following: Hmmm, so this is not actually sufficient. NetApp ran into a very similar race with virtual CPUs in BHyVe. In their case because virtual CPUs are

Re: proposed smp_rendezvous change

2011-05-15 Thread Max Laier
On Sunday 15 May 2011 11:16:03 John Baldwin wrote: On 5/15/11 10:53 AM, Andriy Gapon wrote: on 15/05/2011 10:12 Andriy Gapon said the following: on 14/05/2011 18:25 John Baldwin said the following: Hmmm, so this is not actually sufficient. NetApp ran into a very similar race with virtual

Re: proposed smp_rendezvous change

2011-05-15 Thread Andriy Gapon
on 15/05/2011 19:09 Max Laier said the following: I don't think we ever intended to synchronize the local teardown part, and I believe that is the correct behavior for this API. This version is sufficiently close to what I have, so I am resonably sure that it will work for us. It

Re: proposed smp_rendezvous change

2011-05-15 Thread Andriy Gapon
on 15/05/2011 18:16 John Baldwin said the following: On 5/15/11 10:53 AM, Andriy Gapon wrote: on 15/05/2011 10:12 Andriy Gapon said the following: on 14/05/2011 18:25 John Baldwin said the following: Hmmm, so this is not actually sufficient. NetApp ran into a very similar race with virtual

Re: proposed smp_rendezvous change

2011-05-14 Thread John Baldwin
On 5/13/11 9:43 AM, Andriy Gapon wrote: This is a change in vein of what I've been doing in the xcpu branch and it's supposed to fix the issue by the recent commit that (probably unintentionally) stress-tests smp_rendezvous in TSC code. Non-essential changes: - ditch initial, and in my opinion

Re: proposed smp_rendezvous change

2011-05-14 Thread Max Laier
On Saturday 14 May 2011 11:25:36 John Baldwin wrote: On 5/13/11 9:43 AM, Andriy Gapon wrote: This is a change in vein of what I've been doing in the xcpu branch and it's supposed to fix the issue by the recent commit that (probably unintentionally) stress-tests smp_rendezvous in TSC code.

proposed smp_rendezvous change

2011-05-13 Thread Andriy Gapon
This is a change in vein of what I've been doing in the xcpu branch and it's supposed to fix the issue by the recent commit that (probably unintentionally) stress-tests smp_rendezvous in TSC code. Non-essential changes: - ditch initial, and in my opinion useless, pre-setup rendezvous in

Re: proposed smp_rendezvous change

2011-05-13 Thread Max Laier
On Friday 13 May 2011 09:43:25 Andriy Gapon wrote: This is a change in vein of what I've been doing in the xcpu branch and it's supposed to fix the issue by the recent commit that (probably unintentionally) stress-tests smp_rendezvous in TSC code. Non-essential changes: - ditch initial, and

Re: proposed smp_rendezvous change

2011-05-13 Thread Max Laier
On Friday 13 May 2011 09:43:25 Andriy Gapon wrote: This is a change in vein of what I've been doing in the xcpu branch and it's supposed to fix the issue by the recent commit that (probably unintentionally) stress-tests smp_rendezvous in TSC code. Non-essential changes: - ditch initial, and

Re: proposed smp_rendezvous change

2011-05-13 Thread Andriy Gapon
on 13/05/2011 17:41 Max Laier said the following: this ncpus isn't the one you are looking for. Thank you! Here's an updated patch: Index: sys/kern/subr_smp.c === --- sys/kern/subr_smp.c (revision 221835) +++ sys/kern/subr_smp.c

Re: proposed smp_rendezvous change

2011-05-13 Thread Max Laier
On Friday 13 May 2011 11:28:33 Andriy Gapon wrote: on 13/05/2011 17:41 Max Laier said the following: this ncpus isn't the one you are looking for. Thank you! Here's an updated patch: Can you attach the patch, so I can apply it locally. This code is really hard to read without context.

Re: proposed smp_rendezvous change

2011-05-13 Thread Max Laier
On Friday 13 May 2011 11:50:57 Max Laier wrote: On Friday 13 May 2011 11:28:33 Andriy Gapon wrote: on 13/05/2011 17:41 Max Laier said the following: this ncpus isn't the one you are looking for. Thank you! Here's an updated patch: Can you attach the patch, so I can apply it

Re: proposed smp_rendezvous change

2011-05-13 Thread Andriy Gapon
on 13/05/2011 18:50 Max Laier said the following: On Friday 13 May 2011 11:28:33 Andriy Gapon wrote: on 13/05/2011 17:41 Max Laier said the following: this ncpus isn't the one you are looking for. Thank you! Here's an updated patch: Can you attach the patch, so I can apply it locally.

Re: proposed smp_rendezvous change

2011-05-13 Thread Andriy Gapon
on 13/05/2011 20:13 Max Laier said the following: Disregard this ... I misread the diff. You are indeed using [2] correctly as the all-clear semaphore. I still believe, that it is safer/cleaner to do this spin before releasing the lock instead (see my patch). Maybe. I consider my approach