Re: ARM pcpu changes, was Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Tue, Feb 26, 2013 at 1:39 PM, Konstantin Belousov kostik...@gmail.com wrote: On Tue, Feb 26, 2013 at 09:06:51PM +1300, Andrew Turner wrote: Does anyone know if it is only curthread that needs to be atomic? If so this should work. Reading the cpuid from the system currently is a single instruction, however it appears the code will need to be reworked for use with multiple levels of affinity, for example when there are clusters of cores the current code will number two cores on different clusters with the same id. I don't think we need to use ldrex/strex to read/write the values, my understanding is the rest should be safe to access without atomic functions. I read about ARM architecture some time ago. What I am saying below is not a proposal, but rather a way for me to learn more about the architecture. From my reading of the docs, ARM has the shadow registers, in particular, the r13 (stack pointer) is shadowed for all privileged modes. Is the shadow used by our kernel, is it correctly restored on the context switch ? If yes, you can easily recover the pcb address from the current sp, without accessing any coprocessor registers, and even without any assignment of the global register for curthread (which needs to be done at the kernel entry). Just align the stack start on the 2*PAGE_SIZE boundary (like it is already done for MIPS), and do the mask operation on the sp value to get the end of pcb. This is atomic and context-switch safe. pcb us already per-thread, and can store the thread pointer. More, you can store the curcpu or cpuid pointer into pcb too, and update it on the context switch. amd64 has an architecturally-defined special register (k)gsbase, which effectively must be correctly initialized at each kernel entry, and restored on the return to usermode. Since the initialization on entry and restoration on exit is not atomic wrt the entry/exit itself, amd64 periodically gets a bugs which cause kernel running with user gsbase. This is the fatal bug, destroying the kernel structures and allowing the CPU privilege mode switch for normal user. Using the shadow registers for this purpose eliminate the whole source of the bugs. Well, IMHO, the both kernel and user thread stacks must correctly be set for current threads, otherwise it doesn't work. So, the idea to save things on a thread stack and update them on a context switch will work in general. However, the stack must be aligned to its size. And this is the price. I have no idea how this kernel stack alignment can impact kernel virtual space fragmentation. OTOH, as you say, this kernel stack storage can be used for many things. Svata ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: ARM pcpu changes, was Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Fri, Mar 01, 2013 at 02:20:16PM +0100, Svatopluk Kraus wrote: On Tue, Feb 26, 2013 at 1:39 PM, Konstantin Belousov kostik...@gmail.com wrote: On Tue, Feb 26, 2013 at 09:06:51PM +1300, Andrew Turner wrote: Does anyone know if it is only curthread that needs to be atomic? If so this should work. Reading the cpuid from the system currently is a single instruction, however it appears the code will need to be reworked for use with multiple levels of affinity, for example when there are clusters of cores the current code will number two cores on different clusters with the same id. I don't think we need to use ldrex/strex to read/write the values, my understanding is the rest should be safe to access without atomic functions. I read about ARM architecture some time ago. What I am saying below is not a proposal, but rather a way for me to learn more about the architecture. From my reading of the docs, ARM has the shadow registers, in particular, the r13 (stack pointer) is shadowed for all privileged modes. Is the shadow used by our kernel, is it correctly restored on the context switch ? If yes, you can easily recover the pcb address from the current sp, without accessing any coprocessor registers, and even without any assignment of the global register for curthread (which needs to be done at the kernel entry). Just align the stack start on the 2*PAGE_SIZE boundary (like it is already done for MIPS), and do the mask operation on the sp value to get the end of pcb. This is atomic and context-switch safe. pcb us already per-thread, and can store the thread pointer. More, you can store the curcpu or cpuid pointer into pcb too, and update it on the context switch. amd64 has an architecturally-defined special register (k)gsbase, which effectively must be correctly initialized at each kernel entry, and restored on the return to usermode. Since the initialization on entry and restoration on exit is not atomic wrt the entry/exit itself, amd64 periodically gets a bugs which cause kernel running with user gsbase. This is the fatal bug, destroying the kernel structures and allowing the CPU privilege mode switch for normal user. Using the shadow registers for this purpose eliminate the whole source of the bugs. Well, IMHO, the both kernel and user thread stacks must correctly be set for current threads, otherwise it doesn't work. So, the idea to save things on a thread stack and update them on a context switch will work in general. However, the stack must be aligned to its size. And this is the price. I have no idea how this kernel stack alignment can impact kernel virtual space fragmentation. OTOH, as you say, this kernel stack storage can be used for many things. Hi, FWIW, I have a proof-of-ooncept patch that does just that, and it seems to work well. Regards, Olivier ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
ARM pcpu changes, was Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Mon, 25 Feb 2013 13:47:47 +0100 Olivier Houchard cog...@ci0.org wrote: On Mon, Feb 25, 2013 at 07:09:20PM +1300, Andrew Turner wrote: On Thu, 21 Feb 2013 10:43:49 -0500 John Baldwin j...@freebsd.org wrote: On Thursday, February 21, 2013 7:53:52 am Svatopluk Kraus wrote: On Wed, Feb 20, 2013 at 4:22 PM, John Baldwin j...@freebsd.org wrote: On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. Just for the record. There are three extra (coprocessor) registers per a core in arm11mpcore (armv6k). Unfortunately only one is Privileged. The other ones are respectively User Read only and User Read Write. For now, we are using the Privileged one to store pcpu pointer (curthread is correctly set during each context switch). Thus, using a coprocessor register means that reading of curthread is not a single instruction implementation now. After we figured out the curthread issue in SMP case, using a fixed (processor) register for pcpu is an option. Meanwhile, we disable interrupts before reading of curthread and enable them after. The same is done for other PCPU stuff. For now we have not stable system enough for profiling, however, when it will be, it would be interesting to learn how various implementations of curthread reading impact system performance. curthread is read _a lot_, so I would expect this to hurt. What we did on Alpha was to use a fixed register for pcpu access, but we used the equivalent of a coprocessor register to also store the value so we could set that fixed register on entry to the kernel (userland was free to use the register for its own purposes). The current code on ARM is not atomic, it loads the base address of the pcpu data from the coprocessor then
Re: ARM pcpu changes, was Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Tue, Feb 26, 2013 at 09:06:51PM +1300, Andrew Turner wrote: Does anyone know if it is only curthread that needs to be atomic? If so this should work. Reading the cpuid from the system currently is a single instruction, however it appears the code will need to be reworked for use with multiple levels of affinity, for example when there are clusters of cores the current code will number two cores on different clusters with the same id. I don't think we need to use ldrex/strex to read/write the values, my understanding is the rest should be safe to access without atomic functions. I read about ARM architecture some time ago. What I am saying below is not a proposal, but rather a way for me to learn more about the architecture. From my reading of the docs, ARM has the shadow registers, in particular, the r13 (stack pointer) is shadowed for all privileged modes. Is the shadow used by our kernel, is it correctly restored on the context switch ? If yes, you can easily recover the pcb address from the current sp, without accessing any coprocessor registers, and even without any assignment of the global register for curthread (which needs to be done at the kernel entry). Just align the stack start on the 2*PAGE_SIZE boundary (like it is already done for MIPS), and do the mask operation on the sp value to get the end of pcb. This is atomic and context-switch safe. pcb us already per-thread, and can store the thread pointer. More, you can store the curcpu or cpuid pointer into pcb too, and update it on the context switch. amd64 has an architecturally-defined special register (k)gsbase, which effectively must be correctly initialized at each kernel entry, and restored on the return to usermode. Since the initialization on entry and restoration on exit is not atomic wrt the entry/exit itself, amd64 periodically gets a bugs which cause kernel running with user gsbase. This is the fatal bug, destroying the kernel structures and allowing the CPU privilege mode switch for normal user. Using the shadow registers for this purpose eliminate the whole source of the bugs. pgpmYPFAUYHxg.pgp Description: PGP signature
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Monday, February 25, 2013 7:47:47 am Olivier Houchard wrote: On Mon, Feb 25, 2013 at 07:09:20PM +1300, Andrew Turner wrote: On Thu, 21 Feb 2013 10:43:49 -0500 John Baldwin j...@freebsd.org wrote: On Thursday, February 21, 2013 7:53:52 am Svatopluk Kraus wrote: On Wed, Feb 20, 2013 at 4:22 PM, John Baldwin j...@freebsd.org wrote: On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. Just for the record. There are three extra (coprocessor) registers per a core in arm11mpcore (armv6k). Unfortunately only one is Privileged. The other ones are respectively User Read only and User Read Write. For now, we are using the Privileged one to store pcpu pointer (curthread is correctly set during each context switch). Thus, using a coprocessor register means that reading of curthread is not a single instruction implementation now. After we figured out the curthread issue in SMP case, using a fixed (processor) register for pcpu is an option. Meanwhile, we disable interrupts before reading of curthread and enable them after. The same is done for other PCPU stuff. For now we have not stable system enough for profiling, however, when it will be, it would be interesting to learn how various implementations of curthread reading impact system performance. curthread is read _a lot_, so I would expect this to hurt. What we did on Alpha was to use a fixed register for pcpu access, but we used the equivalent of a coprocessor register to also store the value so we could set that fixed register on entry to the kernel (userland was free to use the register for its own purposes). The current code on ARM is not atomic, it loads the base address of the pcpu data from the coprocessor then loads curthread from this
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Mon, Feb 25, 2013 at 04:26:14PM -0500, John Baldwin wrote: On Monday, February 25, 2013 7:47:47 am Olivier Houchard wrote: On Mon, Feb 25, 2013 at 07:09:20PM +1300, Andrew Turner wrote: On Thu, 21 Feb 2013 10:43:49 -0500 John Baldwin j...@freebsd.org wrote: On Thursday, February 21, 2013 7:53:52 am Svatopluk Kraus wrote: On Wed, Feb 20, 2013 at 4:22 PM, John Baldwin j...@freebsd.org wrote: On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. Just for the record. There are three extra (coprocessor) registers per a core in arm11mpcore (armv6k). Unfortunately only one is Privileged. The other ones are respectively User Read only and User Read Write. For now, we are using the Privileged one to store pcpu pointer (curthread is correctly set during each context switch). Thus, using a coprocessor register means that reading of curthread is not a single instruction implementation now. After we figured out the curthread issue in SMP case, using a fixed (processor) register for pcpu is an option. Meanwhile, we disable interrupts before reading of curthread and enable them after. The same is done for other PCPU stuff. For now we have not stable system enough for profiling, however, when it will be, it would be interesting to learn how various implementations of curthread reading impact system performance. curthread is read _a lot_, so I would expect this to hurt. What we did on Alpha was to use a fixed register for pcpu access, but we used the equivalent of a coprocessor register to also store the value so we could set that fixed register on entry to the kernel (userland was free to use the register for its own
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Mon, Feb 25, 2013 at 07:09:20PM +1300, Andrew Turner wrote: On Thu, 21 Feb 2013 10:43:49 -0500 John Baldwin j...@freebsd.org wrote: On Thursday, February 21, 2013 7:53:52 am Svatopluk Kraus wrote: On Wed, Feb 20, 2013 at 4:22 PM, John Baldwin j...@freebsd.org wrote: On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. Just for the record. There are three extra (coprocessor) registers per a core in arm11mpcore (armv6k). Unfortunately only one is Privileged. The other ones are respectively User Read only and User Read Write. For now, we are using the Privileged one to store pcpu pointer (curthread is correctly set during each context switch). Thus, using a coprocessor register means that reading of curthread is not a single instruction implementation now. After we figured out the curthread issue in SMP case, using a fixed (processor) register for pcpu is an option. Meanwhile, we disable interrupts before reading of curthread and enable them after. The same is done for other PCPU stuff. For now we have not stable system enough for profiling, however, when it will be, it would be interesting to learn how various implementations of curthread reading impact system performance. curthread is read _a lot_, so I would expect this to hurt. What we did on Alpha was to use a fixed register for pcpu access, but we used the equivalent of a coprocessor register to also store the value so we could set that fixed register on entry to the kernel (userland was free to use the register for its own purposes). The current code on ARM is not atomic, it loads the base address of the pcpu data from the coprocessor then loads curthread from this address. One solution I discussed with Olivier Houchard is to keep the data in the coprocessor but to then load it into a dedicated register when we enter the
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Thu, 21 Feb 2013 10:43:49 -0500 John Baldwin j...@freebsd.org wrote: On Thursday, February 21, 2013 7:53:52 am Svatopluk Kraus wrote: On Wed, Feb 20, 2013 at 4:22 PM, John Baldwin j...@freebsd.org wrote: On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. Just for the record. There are three extra (coprocessor) registers per a core in arm11mpcore (armv6k). Unfortunately only one is Privileged. The other ones are respectively User Read only and User Read Write. For now, we are using the Privileged one to store pcpu pointer (curthread is correctly set during each context switch). Thus, using a coprocessor register means that reading of curthread is not a single instruction implementation now. After we figured out the curthread issue in SMP case, using a fixed (processor) register for pcpu is an option. Meanwhile, we disable interrupts before reading of curthread and enable them after. The same is done for other PCPU stuff. For now we have not stable system enough for profiling, however, when it will be, it would be interesting to learn how various implementations of curthread reading impact system performance. curthread is read _a lot_, so I would expect this to hurt. What we did on Alpha was to use a fixed register for pcpu access, but we used the equivalent of a coprocessor register to also store the value so we could set that fixed register on entry to the kernel (userland was free to use the register for its own purposes). The current code on ARM is not atomic, it loads the base address of the pcpu data from the coprocessor then loads curthread from this address. One solution I discussed with Olivier Houchard is to keep the data in the coprocessor but to then load it into a dedicated register when we enter the kernel. Reading curthread would then be a single load instruction thanks to ARM's ldr using an immediate offset [1], e.g. to load from r12 into r1 it would be 'ldr r1,
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Wed, Feb 20, 2013 at 4:22 PM, John Baldwin j...@freebsd.org wrote: On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. Just for the record. There are three extra (coprocessor) registers per a core in arm11mpcore (armv6k). Unfortunately only one is Privileged. The other ones are respectively User Read only and User Read Write. For now, we are using the Privileged one to store pcpu pointer (curthread is correctly set during each context switch). Thus, using a coprocessor register means that reading of curthread is not a single instruction implementation now. After we figured out the curthread issue in SMP case, using a fixed (processor) register for pcpu is an option. Meanwhile, we disable interrupts before reading of curthread and enable them after. The same is done for other PCPU stuff. For now we have not stable system enough for profiling, however, when it will be, it would be interesting to learn how various implementations of curthread reading impact system performance. Svata ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Thursday, February 21, 2013 7:53:52 am Svatopluk Kraus wrote: On Wed, Feb 20, 2013 at 4:22 PM, John Baldwin j...@freebsd.org wrote: On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. Just for the record. There are three extra (coprocessor) registers per a core in arm11mpcore (armv6k). Unfortunately only one is Privileged. The other ones are respectively User Read only and User Read Write. For now, we are using the Privileged one to store pcpu pointer (curthread is correctly set during each context switch). Thus, using a coprocessor register means that reading of curthread is not a single instruction implementation now. After we figured out the curthread issue in SMP case, using a fixed (processor) register for pcpu is an option. Meanwhile, we disable interrupts before reading of curthread and enable them after. The same is done for other PCPU stuff. For now we have not stable system enough for profiling, however, when it will be, it would be interesting to learn how various implementations of curthread reading impact system performance. curthread is read _a lot_, so I would expect this to hurt. What we did on Alpha was to use a fixed register for pcpu access, but we used the equivalent of a coprocessor register to also store the value so we could set that fixed register on entry to the kernel (userland was free to use the register for its own purposes). -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Thu, Feb 21, 2013 at 4:43 PM, John Baldwin j...@freebsd.org wrote: curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. Just for the record. There are three extra (coprocessor) registers per a core in arm11mpcore (armv6k). Unfortunately only one is Privileged. The other ones are respectively User Read only and User Read Write. For now, we are using the Privileged one to store pcpu pointer (curthread is correctly set during each context switch). Thus, using a coprocessor register means that reading of curthread is not a single instruction implementation now. After we figured out the curthread issue in SMP case, using a fixed (processor) register for pcpu is an option. Meanwhile, we disable interrupts before reading of curthread and enable them after. The same is done for other PCPU stuff. For now we have not stable system enough for profiling, however, when it will be, it would be interesting to learn how various implementations of curthread reading impact system performance. curthread is read _a lot_, so I would expect this to hurt. What we did on Alpha was to use a fixed register for pcpu access, but we used the equivalent of a coprocessor register to also store the value so we could set that fixed register on entry to the kernel (userland was free to use the register for its own purposes). Interesting solution. Thanks. Svata ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. After this, I took a look at how PCPU stuff is used in whole kernel and found out mentioned here i386 pmap issue. So, my concern is following: 1. to verify my newly gained ideas how per CPU data should be used, 2. to decide how to change my implementation of pmap on ARM11mpcore, as it's based on i386 pmap, 3. to make FreeBSD code better. In the meanwhile, it looks that using data dedicated to one CPU on another one is OK. However, I can't agree. At least, without comments, it is misleading for anyone new in FreeBSD and makes code misty. Both threads in your description make useful progress, and computation proceeds correctly. I thought, there is only one thread in my example. One thread running on CPU1, but holding sysmaps dedicated to CPU0 instead of holding sysmaps dedicated to CPU1. So, any thread running on CPU0 must wait because the thread running on CPU1 is a thief. Futhermore, the idea that a thread on CPU1 should hold data for CPU1 is not valid. So, either some comment is missing in the code that it's OK or the using of PCPU_GET(cpuid) is unneeded and some kind of free sysmaps list can be used and it will serve better. What is wrong on almost all architectures except x86 is PCPU_ADD(), which is usually supposed by the MD code to be atomic in regard to _both_ interrupts and preemption. I said almost all, but in fact I am not aware of any architecture except x86 where it is right. And, RISCs with the load-link and store-conditional (ll/sc) primitives are well capable of doing the PCPU_ADD() correctly. You could look at the projects/counters branch, where single-instruction increment is used on x86 for dynamic per-cpu counters, and critical_enter() for other architectures. I might do some work for other arches, but the counters are correct but slower that it could be, on !x86. Thanks to help me to settle my thoughts. Svata ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Wed, Feb 20, 2013 at 10:22:29AM -0500, John Baldwin wrote: On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. I just looked at the arm pcpu.h, and indeed the curthread falls back to the default implementation from sys/pcpu.h, which is get_pcpu()-pc_curthread. This looks buggy for SMP, does our arm port support any multi-cpu configuration ? pgpoLvGDsLIGn.pgp Description: PGP signature
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Wednesday, February 20, 2013 2:27:39 pm Konstantin Belousov wrote: On Wed, Feb 20, 2013 at 10:22:29AM -0500, John Baldwin wrote: On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. I just looked at the arm pcpu.h, and indeed the curthread falls back to the default implementation from sys/pcpu.h, which is get_pcpu()-pc_curthread. This looks buggy for SMP, does our arm port support any multi-cpu configuration ? Not in HEAD. There is a projects branch for armv6 I think that does. -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Wed, 2013-02-20 at 14:32 -0500, John Baldwin wrote: On Wednesday, February 20, 2013 2:27:39 pm Konstantin Belousov wrote: On Wed, Feb 20, 2013 at 10:22:29AM -0500, John Baldwin wrote: On Wednesday, February 20, 2013 7:31:08 am Svatopluk Kraus wrote: On Tue, Feb 19, 2013 at 7:51 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). So, it's advisory in the case of i386 pmap... Well, if you can live with that, I can too. Also, note that it is not interrupts but preemption which is concern. Yes and no. In theory, yes, a preemption is a concern. In FreeBSD, however, sched_pin() and critical_enter() and their counterparts are implemented with help of curthread. And curthread definition falls to PCPU_GET(curthread) if not defined in other way. So, curthread should be atomic with respect to interrupts and in general, PCPU_GET() should be too. Note that spinlock_enter() definitions often (always) use curthread too. Anyhow, it's defined in MD code and can be defined for each arch separately. curthread is a bit magic. :) If you perform a context switch during an interrupt (which will change 'curthread') you also change your register state. When you resume, the register state is also restored. This means that while something like 'PCPU_GET(cpuid)' might be stale after you read it, 'curthread' never is. However, it is true that actually reading curthread has to be atomic. If your read of curthread looks like: mov pcpu reg, r0 add offset of pc_curthread, r0 ld r0, r1 Then that will indeed break. Alpha used a fixed register for 'pcpu_reg' (as does ia64 IIRC). OTOH, you might also be able to depend on the fact that pc_curthread is the first thing in 'struct pcpu' (and always will be, you could add a CTASSERT to future-proof). In that case, you can remove the 'add' instruction and instead just do: ld pcpu reg, r1 which is fine. I just looked at the arm pcpu.h, and indeed the curthread falls back to the default implementation from sys/pcpu.h, which is get_pcpu()-pc_curthread. This looks buggy for SMP, does our arm port support any multi-cpu configuration ? Not in HEAD. There is a projects branch for armv6 I think that does. There isn't anything I've heard of that makes it all the way to single user mode yet with multiple cores enabled. -- Ian ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Mon, Feb 18, 2013 at 11:18:16PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... Note that PCPU_GET() is not needed to be atomic. The reason is that the code which uses its result would not be atomic (single-instruction or whatever, see below). Thus, either the preemption should not matter since the action with the per-cpu data is advisory, as is in the case of i386 pmap you noted. or some external measures should be applied in advance to the containing region (which you proposed, by bracing with sched_pin()). Also, note that it is not interrupts but preemption which is concern. After this, I took a look at how PCPU stuff is used in whole kernel and found out mentioned here i386 pmap issue. So, my concern is following: 1. to verify my newly gained ideas how per CPU data should be used, 2. to decide how to change my implementation of pmap on ARM11mpcore, as it's based on i386 pmap, 3. to make FreeBSD code better. In the meanwhile, it looks that using data dedicated to one CPU on another one is OK. However, I can't agree. At least, without comments, it is misleading for anyone new in FreeBSD and makes code misty. Both threads in your description make useful progress, and computation proceeds correctly. I thought, there is only one thread in my example. One thread running on CPU1, but holding sysmaps dedicated to CPU0 instead of holding sysmaps dedicated to CPU1. So, any thread running on CPU0 must wait because the thread running on CPU1 is a thief. Futhermore, the idea that a thread on CPU1 should hold data for CPU1 is not valid. So, either some comment is missing in the code that it's OK or the using of PCPU_GET(cpuid) is unneeded and some kind of free sysmaps list can be used and it will serve better. What is wrong on almost all architectures except x86 is PCPU_ADD(), which is usually supposed by the MD code to be atomic in regard to _both_ interrupts and preemption. I said almost all, but in fact I am not aware of any architecture except x86 where it is right. And, RISCs with the load-link and store-conditional (ll/sc) primitives are well capable of doing the PCPU_ADD() correctly. You could look at the projects/counters branch, where single-instruction increment is used on x86 for dynamic per-cpu counters, and critical_enter() for other architectures. I might do some work for other arches, but the counters are correct but slower that it could be, on !x86. pgpKvcaK3GJAa.pgp Description: PGP signature
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Mon, Feb 18, 2013 at 01:44:35PM +0100, Svatopluk Kraus wrote: Hi, the access to sysmaps_pcpu[] should be atomic with respect to thread migration. Otherwise, a sysmaps for one CPU can be stolen by another CPU and the purpose of per CPU sysmaps is broken. A patch is enclosed. And, what are the problem caused by the 'otherwise' ? I do not see any. Really, taking the mutex while bind to a CPU could be deadlock-prone under some situations. This was discussed at least one more time. Might be, a comment saying that there is no issue should be added. Svata Index: sys/i386/i386/pmap.c === --- sys/i386/i386/pmap.c (revision 246831) +++ sys/i386/i386/pmap.c (working copy) @@ -4146,11 +4146,11 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_zero_page: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); @@ -4171,11 +4171,11 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_zero_page_area: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); @@ -4220,13 +4220,13 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP1) panic(pmap_copy_page: CMAP1 busy); if (*sysmaps-CMAP2) panic(pmap_copy_page: CMAP2 busy); - sched_pin(); invlpg((u_int)sysmaps-CADDR1); invlpg((u_int)sysmaps-CADDR2); *sysmaps-CMAP1 = PG_V | VM_PAGE_TO_PHYS(src) | PG_A | @@ -5072,11 +5072,11 @@ vm_offset_t sva, eva; if ((cpu_feature CPUID_CLFSH) != 0) { + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_flush_page: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org pgpJybqFrB7Kk.pgp Description: PGP signature
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Mon, Feb 18, 2013 at 06:06:42PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 4:08 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 01:44:35PM +0100, Svatopluk Kraus wrote: Hi, the access to sysmaps_pcpu[] should be atomic with respect to thread migration. Otherwise, a sysmaps for one CPU can be stolen by another CPU and the purpose of per CPU sysmaps is broken. A patch is enclosed. And, what are the problem caused by the 'otherwise' ? I do not see any. The 'otherwise' issue is the following: 1. A thread is running on CPU0. sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; 2. A sysmaps variable contains a pointer to 'CPU0' sysmaps. 3. Now, the thread migrates into CPU1. 4. However, the sysmaps variable still contains a pointers to 'CPU0' sysmaps. mtx_lock(sysmaps-lock); 4. The thread running on CPU1 locked 'CPU0' sysmaps mutex, so the thread uselessly can block another thread running on CPU0. Maybe, it's not a problem. However, it definitely goes against the reason why the submaps (one for each CPU) exist. So what ? Really, taking the mutex while bind to a CPU could be deadlock-prone under some situations. This was discussed at least one more time. Might be, a comment saying that there is no issue should be added. I missed the discussion. Can you point me to it, please? A deadlock is not problem here, however, I can be wrong, as I can't imagine now how a simple pinning could lead into a deadlock at all. Because some other load on the bind cpu might prevent the thread from being scheduled. Svata Index: sys/i386/i386/pmap.c === --- sys/i386/i386/pmap.c (revision 246831) +++ sys/i386/i386/pmap.c (working copy) @@ -4146,11 +4146,11 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_zero_page: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); @@ -4171,11 +4171,11 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_zero_page_area: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); @@ -4220,13 +4220,13 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP1) panic(pmap_copy_page: CMAP1 busy); if (*sysmaps-CMAP2) panic(pmap_copy_page: CMAP2 busy); - sched_pin(); invlpg((u_int)sysmaps-CADDR1); invlpg((u_int)sysmaps-CADDR2); *sysmaps-CMAP1 = PG_V | VM_PAGE_TO_PHYS(src) | PG_A | @@ -5072,11 +5072,11 @@ vm_offset_t sva, eva; if ((cpu_feature CPUID_CLFSH) != 0) { + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_flush_page: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org pgp9RnaKGdSYD.pgp Description: PGP signature
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Mon, Feb 18, 2013 at 4:08 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 01:44:35PM +0100, Svatopluk Kraus wrote: Hi, the access to sysmaps_pcpu[] should be atomic with respect to thread migration. Otherwise, a sysmaps for one CPU can be stolen by another CPU and the purpose of per CPU sysmaps is broken. A patch is enclosed. And, what are the problem caused by the 'otherwise' ? I do not see any. The 'otherwise' issue is the following: 1. A thread is running on CPU0. sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; 2. A sysmaps variable contains a pointer to 'CPU0' sysmaps. 3. Now, the thread migrates into CPU1. 4. However, the sysmaps variable still contains a pointers to 'CPU0' sysmaps. mtx_lock(sysmaps-lock); 4. The thread running on CPU1 locked 'CPU0' sysmaps mutex, so the thread uselessly can block another thread running on CPU0. Maybe, it's not a problem. However, it definitely goes against the reason why the submaps (one for each CPU) exist. Really, taking the mutex while bind to a CPU could be deadlock-prone under some situations. This was discussed at least one more time. Might be, a comment saying that there is no issue should be added. I missed the discussion. Can you point me to it, please? A deadlock is not problem here, however, I can be wrong, as I can't imagine now how a simple pinning could lead into a deadlock at all. Svata Index: sys/i386/i386/pmap.c === --- sys/i386/i386/pmap.c (revision 246831) +++ sys/i386/i386/pmap.c (working copy) @@ -4146,11 +4146,11 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_zero_page: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); @@ -4171,11 +4171,11 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_zero_page_area: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); @@ -4220,13 +4220,13 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP1) panic(pmap_copy_page: CMAP1 busy); if (*sysmaps-CMAP2) panic(pmap_copy_page: CMAP2 busy); - sched_pin(); invlpg((u_int)sysmaps-CADDR1); invlpg((u_int)sysmaps-CADDR2); *sysmaps-CMAP1 = PG_V | VM_PAGE_TO_PHYS(src) | PG_A | @@ -5072,11 +5072,11 @@ vm_offset_t sva, eva; if ((cpu_feature CPUID_CLFSH) != 0) { + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_flush_page: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Mon, Feb 18, 2013 at 6:09 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 06:06:42PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 4:08 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 01:44:35PM +0100, Svatopluk Kraus wrote: Hi, the access to sysmaps_pcpu[] should be atomic with respect to thread migration. Otherwise, a sysmaps for one CPU can be stolen by another CPU and the purpose of per CPU sysmaps is broken. A patch is enclosed. And, what are the problem caused by the 'otherwise' ? I do not see any. The 'otherwise' issue is the following: 1. A thread is running on CPU0. sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; 2. A sysmaps variable contains a pointer to 'CPU0' sysmaps. 3. Now, the thread migrates into CPU1. 4. However, the sysmaps variable still contains a pointers to 'CPU0' sysmaps. mtx_lock(sysmaps-lock); 4. The thread running on CPU1 locked 'CPU0' sysmaps mutex, so the thread uselessly can block another thread running on CPU0. Maybe, it's not a problem. However, it definitely goes against the reason why the submaps (one for each CPU) exist. So what ? It depends. You don't understand it or you think it's ok? Tell me. Really, taking the mutex while bind to a CPU could be deadlock-prone under some situations. This was discussed at least one more time. Might be, a comment saying that there is no issue should be added. I missed the discussion. Can you point me to it, please? A deadlock is not problem here, however, I can be wrong, as I can't imagine now how a simple pinning could lead into a deadlock at all. Because some other load on the bind cpu might prevent the thread from being scheduled. I'm afraid I still have no idea. On single CPU, a binding has no meaning. Thus, if any deadlock exists then exists without binding too. Hmm, you are talking about a deadlock caused by heavy CPU load? Is it a deadlock at all? Anyhow, mutex is a lock with priority propagation, isn't it? Svata Index: sys/i386/i386/pmap.c === --- sys/i386/i386/pmap.c (revision 246831) +++ sys/i386/i386/pmap.c (working copy) @@ -4146,11 +4146,11 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_zero_page: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); @@ -4171,11 +4171,11 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_zero_page_area: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); @@ -4220,13 +4220,13 @@ { struct sysmaps *sysmaps; + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP1) panic(pmap_copy_page: CMAP1 busy); if (*sysmaps-CMAP2) panic(pmap_copy_page: CMAP2 busy); - sched_pin(); invlpg((u_int)sysmaps-CADDR1); invlpg((u_int)sysmaps-CADDR2); *sysmaps-CMAP1 = PG_V | VM_PAGE_TO_PHYS(src) | PG_A | @@ -5072,11 +5072,11 @@ vm_offset_t sva, eva; if ((cpu_feature CPUID_CLFSH) != 0) { + sched_pin(); sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(sysmaps-lock); if (*sysmaps-CMAP2) panic(pmap_flush_page: CMAP2 busy); - sched_pin(); *sysmaps-CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m-md.pat_mode, 0); invlcaddr(sysmaps-CADDR2); ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Mon, Feb 18, 2013 at 09:27:40PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 6:09 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 06:06:42PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 4:08 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 01:44:35PM +0100, Svatopluk Kraus wrote: Hi, the access to sysmaps_pcpu[] should be atomic with respect to thread migration. Otherwise, a sysmaps for one CPU can be stolen by another CPU and the purpose of per CPU sysmaps is broken. A patch is enclosed. And, what are the problem caused by the 'otherwise' ? I do not see any. The 'otherwise' issue is the following: 1. A thread is running on CPU0. sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; 2. A sysmaps variable contains a pointer to 'CPU0' sysmaps. 3. Now, the thread migrates into CPU1. 4. However, the sysmaps variable still contains a pointers to 'CPU0' sysmaps. mtx_lock(sysmaps-lock); 4. The thread running on CPU1 locked 'CPU0' sysmaps mutex, so the thread uselessly can block another thread running on CPU0. Maybe, it's not a problem. However, it definitely goes against the reason why the submaps (one for each CPU) exist. So what ? It depends. You don't understand it or you think it's ok? Tell me. Both. I do not understand your concern, and I think that the code is fine. Both threads in your description make useful progress, and computation proceeds correctly. Really, taking the mutex while bind to a CPU could be deadlock-prone under some situations. This was discussed at least one more time. Might be, a comment saying that there is no issue should be added. I missed the discussion. Can you point me to it, please? A deadlock is not problem here, however, I can be wrong, as I can't imagine now how a simple pinning could lead into a deadlock at all. Because some other load on the bind cpu might prevent the thread from being scheduled. I'm afraid I still have no idea. On single CPU, a binding has no meaning. Thus, if any deadlock exists then exists without binding too. Hmm, you are talking about a deadlock caused by heavy CPU load? Is it a deadlock at all? Anyhow, mutex is a lock with priority propagation, isn't it? When executing on single cpu, kernel sometimes make different decisions. Yes, the deadlock can be more precisely described as livelock. It might not make any matter for exactly this case, but still is useful to remember. pgpRzTWR25fZB.pgp Description: PGP signature
Re: [patch] i386 pmap sysmaps_pcpu[] atomic access
On Mon, Feb 18, 2013 at 9:36 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 09:27:40PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 6:09 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 06:06:42PM +0100, Svatopluk Kraus wrote: On Mon, Feb 18, 2013 at 4:08 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Feb 18, 2013 at 01:44:35PM +0100, Svatopluk Kraus wrote: Hi, the access to sysmaps_pcpu[] should be atomic with respect to thread migration. Otherwise, a sysmaps for one CPU can be stolen by another CPU and the purpose of per CPU sysmaps is broken. A patch is enclosed. And, what are the problem caused by the 'otherwise' ? I do not see any. The 'otherwise' issue is the following: 1. A thread is running on CPU0. sysmaps = sysmaps_pcpu[PCPU_GET(cpuid)]; 2. A sysmaps variable contains a pointer to 'CPU0' sysmaps. 3. Now, the thread migrates into CPU1. 4. However, the sysmaps variable still contains a pointers to 'CPU0' sysmaps. mtx_lock(sysmaps-lock); 4. The thread running on CPU1 locked 'CPU0' sysmaps mutex, so the thread uselessly can block another thread running on CPU0. Maybe, it's not a problem. However, it definitely goes against the reason why the submaps (one for each CPU) exist. So what ? It depends. You don't understand it or you think it's ok? Tell me. Both. I do not understand your concern, and I think that the code is fine. Well, I'm taking a part on porting FreeBSD to ARM11mpcore. UP case was simple. SMP case is more complex and rather new for me. Recently, I was solving a problem with PCPU stuff. For example, PCPU_GET is implemented by one instruction on i386 arch. So, a need of atomicity with respect to interrupts can be overlooked. On load-store archs, the implementation which works in SMP case is not so simple. And what works in UP case (single PCPU), not works in SMP case. Believe me, mysterious and sporadic 'mutex not owned' assertions and others ones caused by curthreads mess, it takes a while ... After this, I took a look at how PCPU stuff is used in whole kernel and found out mentioned here i386 pmap issue. So, my concern is following: 1. to verify my newly gained ideas how per CPU data should be used, 2. to decide how to change my implementation of pmap on ARM11mpcore, as it's based on i386 pmap, 3. to make FreeBSD code better. In the meanwhile, it looks that using data dedicated to one CPU on another one is OK. However, I can't agree. At least, without comments, it is misleading for anyone new in FreeBSD and makes code misty. Both threads in your description make useful progress, and computation proceeds correctly. I thought, there is only one thread in my example. One thread running on CPU1, but holding sysmaps dedicated to CPU0 instead of holding sysmaps dedicated to CPU1. So, any thread running on CPU0 must wait because the thread running on CPU1 is a thief. Futhermore, the idea that a thread on CPU1 should hold data for CPU1 is not valid. So, either some comment is missing in the code that it's OK or the using of PCPU_GET(cpuid) is unneeded and some kind of free sysmaps list can be used and it will serve better. Really, taking the mutex while bind to a CPU could be deadlock-prone under some situations. This was discussed at least one more time. Might be, a comment saying that there is no issue should be added. I missed the discussion. Can you point me to it, please? A deadlock is not problem here, however, I can be wrong, as I can't imagine now how a simple pinning could lead into a deadlock at all. Because some other load on the bind cpu might prevent the thread from being scheduled. I'm afraid I still have no idea. On single CPU, a binding has no meaning. Thus, if any deadlock exists then exists without binding too. Hmm, you are talking about a deadlock caused by heavy CPU load? Is it a deadlock at all? Anyhow, mutex is a lock with priority propagation, isn't it? When executing on single cpu, kernel sometimes make different decisions. Yes, the deadlock can be more precisely described as livelock. It might not make any matter for exactly this case, but still is useful to remember. ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org