Re: Getting the current thread ID without a syscall?
On 1/15/13 4:03 PM, Trent Nelson wrote:
> On Tue, Jan 15, 2013 at 02:33:41PM -0800, Ian Lepore wrote:
>> On Tue, 2013-01-15 at 14:29 -0800, Alfred Perlstein wrote:
>>> On 1/15/13 1:43 PM, Konstantin Belousov wrote:
>>>> On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:
>>>>> Luckily it's for an open source project (Python), so
>>>>> recompilation isn't a big deal. (I also check the intrinsic
>>>>> result versus the syscall result during startup to verify the
>>>>> same ID is returned, falling back to the syscall by default.)
>>>>
>>>> For you, maybe. For your users, it definitely will be a problem.
>>>> And worse, the problem will be blamed on the operating system
>>>> and not on the broken application.
>>>
>>> Anything we can do to avoid this would be best. The reason is that
>>> we are still dealing with an optimization that perl did: it reached
>>> inside the opaque struct FILE to do nasty things. Now it is very
>>> difficult for us to fix struct FILE. We are still paying for this
>>> years later.
>>>
>>> Any way we can make this a supported interface?
>>>
>>> -Alfred
>>
>> Re-reading the original question, I've got to ask why pthread_self()
>> isn't the right answer? The requirement wasn't "I need to know what
>> the OS calls me", it was "I need a unique ID per thread within a
>> process."
>
> The identity check is performed hundreds of times per second. The
> overhead of (Py_MainThreadId == __readgsdword(0x48) ? A() : B()) is
> negligible -- I can't say the same for a system/function call.
>
> (I'm experimenting with an idea I had to parallelize Python such that
> it can exploit all cores without impeding the performance of normal
> single-threaded execution (as previous GIL-removal attempts and STM
> did). It's very promising so far -- presuming we can get the current
> thread ID in a couple of instructions. If not, single-threaded
> performance suffers too much.)
>
>     Trent.

TLS?
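One way to read the "TLS?" suggestion is to skip thread IDs altogether and keep a per-thread flag in thread-local storage. The following is a minimal sketch under that assumption, not anything taken from the Python sources; every name in it (tls_is_main_thread, mark_main_thread, on_main_thread, flag_seen_by_new_thread) is hypothetical. How cheap the read is depends on the TLS model the compiler picks: in a main executable it is typically a single segment-relative load, while code built into a shared object may go through a helper call.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* All names below are hypothetical; none come from the Python sources. */

/* One copy per thread; zero-initialised in every newly created thread. */
static _Thread_local int tls_is_main_thread;

/* Call once, from the main thread, during startup. */
static void mark_main_thread(void)
{
    tls_is_main_thread = 1;
}

/* Hot-path check: a read of a thread-local variable. */
static inline int on_main_thread(void)
{
    return tls_is_main_thread;
}

/* Thread body: record what the new thread sees. */
static void *record_flag(void *out)
{
    *(int *)out = on_main_thread();
    return NULL;
}

/* Spawns one thread and returns the flag value it observed. */
static int flag_seen_by_new_thread(void)
{
    pthread_t t;
    int seen = -1;
    int rc;

    rc = pthread_create(&t, NULL, record_flag, &seen);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;   /* keeps NDEBUG builds free of unused-variable warnings */
    return seen;
}
```

After mark_main_thread() has run in the main thread, on_main_thread() returns 1 there and 0 in every other thread, so no comparison against a saved ID is needed.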
___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Getting the current thread ID without a syscall?
Howdy,

I have an unusual requirement: I need to get the current thread ID in
as few instructions as possible. On Windows, I managed to come up with
this glorious hack:

    #ifdef WITH_INTRINSICS
    #   ifdef MS_WINDOWS
    #       include <intrin.h>
    #       if defined(MS_WIN64)
    #           pragma intrinsic(__readgsdword)
    #           define _Py_get_current_process_id() (__readgsdword(0x40))
    #           define _Py_get_current_thread_id()  (__readgsdword(0x48))
    #       elif defined(MS_WIN32)
    #           pragma intrinsic(__readfsdword)
    #           define _Py_get_current_process_id() (__readfsdword(0x20))
    #           define _Py_get_current_thread_id()  (__readfsdword(0x24))

That exploits the fact that Windows uses the FS/GS registers to store
thread/process metadata. Could I use a similar approach on FreeBSD to
get the thread ID without the need for syscalls?

(I technically don't need the thread ID, I just need to get some form
of unique identifier for the current thread such that I can compare it
to a known global value that's been set to the main thread, in order to
determine if I'm currently that thread or not. As long as it's unique
for each thread, and static for the lifetime of the thread, that's
fine.)

The "am I the main thread?" comparison is made every ~50-100 opcodes,
which is why it needs to have the lowest overhead possible.

Regards,

    Trent.
Re: Getting the current thread ID without a syscall?
On Tue, Jan 15, 2013 at 03:54:03PM -0500, Trent Nelson wrote:
> Howdy,
>
> I have an unusual requirement: I need to get the current thread ID in
> as few instructions as possible. On Windows, I managed to come up
> with this glorious hack:
>
>     #ifdef WITH_INTRINSICS
>     #   ifdef MS_WINDOWS
>     #       include <intrin.h>
>     #       if defined(MS_WIN64)
>     #           pragma intrinsic(__readgsdword)
>     #           define _Py_get_current_process_id() (__readgsdword(0x40))
>     #           define _Py_get_current_thread_id()  (__readgsdword(0x48))
>     #       elif defined(MS_WIN32)
>     #           pragma intrinsic(__readfsdword)
>     #           define _Py_get_current_process_id() (__readfsdword(0x20))
>     #           define _Py_get_current_thread_id()  (__readfsdword(0x24))
>
> That exploits the fact that Windows uses the FS/GS registers to store
> thread/process metadata. Could I use a similar approach on FreeBSD to
> get the thread ID without the need for syscalls?

The layout of the per-thread structure used by libthr is private and is
not guaranteed to be stable even on the stable branches. Yes, you could
obtain the tid this way, but note explicitly that using it makes your
application not binary compatible with any version of FreeBSD except
the one you compiled on.

You could read the _thread_off_tid integer variable and use the value
as the offset from the %fs base to the long containing the unique
thread id. But don't use this in anything except private code.

> (I technically don't need the thread ID, I just need to get some form
> of unique identifier for the current thread such that I can compare
> it to a known global value that's been set to the main thread, in
> order to determine if I'm currently that thread or not. As long as
> it's unique for each thread, and static for the lifetime of the
> thread, that's fine.)
>
> The "am I the main thread?" comparison is made every ~50-100 opcodes,
> which is why it needs to have the lowest overhead possible.

On newer CPUs in amd64 mode, there is the rdfsbase instruction, which
reads the %fs register base. The system guarantees that the %fs base is
unique among live threads.
Re: Getting the current thread ID without a syscall?
On Tue, Jan 15, 2013 at 01:16:41PM -0800, Konstantin Belousov wrote:
> On Tue, Jan 15, 2013 at 03:54:03PM -0500, Trent Nelson wrote:
>> Howdy,
>>
>> I have an unusual requirement: I need to get the current thread ID
>> in as few instructions as possible. On Windows, I managed to come up
>> with this glorious hack:
>>
>>     #ifdef WITH_INTRINSICS
>>     #   ifdef MS_WINDOWS
>>     #       include <intrin.h>
>>     #       if defined(MS_WIN64)
>>     #           pragma intrinsic(__readgsdword)
>>     #           define _Py_get_current_process_id() (__readgsdword(0x40))
>>     #           define _Py_get_current_thread_id()  (__readgsdword(0x48))
>>     #       elif defined(MS_WIN32)
>>     #           pragma intrinsic(__readfsdword)
>>     #           define _Py_get_current_process_id() (__readfsdword(0x20))
>>     #           define _Py_get_current_thread_id()  (__readfsdword(0x24))
>>
>> That exploits the fact that Windows uses the FS/GS registers to
>> store thread/process metadata. Could I use a similar approach on
>> FreeBSD to get the thread ID without the need for syscalls?
>
> The layout of the per-thread structure used by libthr is private and
> is not guaranteed to be stable even on the stable branches. Yes, you
> could obtain the tid this way, but note explicitly that using it
> makes your application not binary compatible with any version of
> FreeBSD except the one you compiled on.

Luckily it's for an open source project (Python), so recompilation
isn't a big deal. (I also check the intrinsic result versus the syscall
result during startup to verify the same ID is returned, falling back
to the syscall by default.)

> You could read the _thread_off_tid integer variable and use the value
> as the offset from the %fs base to the long containing the unique
> thread id. But don't use this in anything except private code.

Ah, thanks, that's what I was interested in knowing.

>> (I technically don't need the thread ID, I just need to get some
>> form of unique identifier for the current thread such that I can
>> compare it to a known global value that's been set to the main
>> thread, in order to determine if I'm currently that thread or not.
>> As long as it's unique for each thread, and static for the lifetime
>> of the thread, that's fine.)
>>
>> The "am I the main thread?" comparison is made every ~50-100
>> opcodes, which is why it needs to have the lowest overhead possible.
>
> On newer CPUs in amd64 mode, there is the rdfsbase instruction, which
> reads the %fs register base. The system guarantees that the %fs base
> is unique among live threads.

Interesting. I was aware of those instructions, but never assessed them
in detail once I'd figured out the readgsdword approach. I definitely
didn't realize they return unique values per thread (although it makes
sense now that I think about it).

Thanks Konstantin, very helpful.

    Trent.
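The startup check mentioned above (compare the shortcut against a trusted value, and fall back if they disagree) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the Python branch: pthread_self() stands in for the syscall result as the trusted value, the shortcut simply reads the word at the thread pointer on x86 ELF targets, and all function names are hypothetical. Whether the two values agree depends on the C library, so the sketch has to work with either outcome.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

typedef uintptr_t (*thread_id_fn)(void);

/* Trusted path. Assumes pthread_t is an integer or pointer type, which
 * holds on FreeBSD and Linux but is not required by POSIX. */
static uintptr_t trusted_thread_id(void)
{
    return (uintptr_t)pthread_self();
}

/* Unsupported shortcut: read the word at the thread pointer directly. */
static uintptr_t shortcut_thread_id(void)
{
#if defined(__ELF__) && defined(__x86_64__)
    uintptr_t v;
    __asm__ volatile ("movq %%fs:0, %0" : "=r" (v));
    return v;
#elif defined(__ELF__) && defined(__i386__)
    uintptr_t v;
    __asm__ volatile ("movl %%gs:0, %0" : "=r" (v));
    return v;
#else
    return trusted_thread_id();   /* no shortcut known for this target */
#endif
}

/* Startup check: adopt the shortcut only if it agrees with the trusted
 * value on the calling thread; otherwise keep the trusted path. */
static thread_id_fn choose_thread_id_fn(void)
{
    if (shortcut_thread_id() == trusted_thread_id())
        return shortcut_thread_id;
    return trusted_thread_id;
}

/* Usage: choose once at startup, then call through the pointer. */
static void startup_self_check(void)
{
    thread_id_fn get_id = choose_thread_id_fn();
    assert(get_id() == trusted_thread_id());
}
```

The check only validates the shortcut on the thread that runs it, so it catches a layout mismatch at startup but does not make the shortcut a supported interface.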
Re: Getting the current thread ID without a syscall?
On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:
> On Tue, Jan 15, 2013 at 01:16:41PM -0800, Konstantin Belousov wrote:
>> On Tue, Jan 15, 2013 at 03:54:03PM -0500, Trent Nelson wrote:
>>> Howdy,
>>>
>>> I have an unusual requirement: I need to get the current thread ID
>>> in as few instructions as possible. On Windows, I managed to come
>>> up with this glorious hack:
>>>
>>>     #ifdef WITH_INTRINSICS
>>>     #   ifdef MS_WINDOWS
>>>     #       include <intrin.h>
>>>     #       if defined(MS_WIN64)
>>>     #           pragma intrinsic(__readgsdword)
>>>     #           define _Py_get_current_process_id() (__readgsdword(0x40))
>>>     #           define _Py_get_current_thread_id()  (__readgsdword(0x48))
>>>     #       elif defined(MS_WIN32)
>>>     #           pragma intrinsic(__readfsdword)
>>>     #           define _Py_get_current_process_id() (__readfsdword(0x20))
>>>     #           define _Py_get_current_thread_id()  (__readfsdword(0x24))
>>>
>>> That exploits the fact that Windows uses the FS/GS registers to
>>> store thread/process metadata. Could I use a similar approach on
>>> FreeBSD to get the thread ID without the need for syscalls?
>>
>> The layout of the per-thread structure used by libthr is private and
>> is not guaranteed to be stable even on the stable branches. Yes, you
>> could obtain the tid this way, but note explicitly that using it
>> makes your application not binary compatible with any version of
>> FreeBSD except the one you compiled on.
>
> Luckily it's for an open source project (Python), so recompilation
> isn't a big deal. (I also check the intrinsic result versus the
> syscall result during startup to verify the same ID is returned,
> falling back to the syscall by default.)

For you, maybe. For your users, it definitely will be a problem. And
worse, the problem will be blamed on the operating system and not on
the broken application.

>> You could read the _thread_off_tid integer variable and use the
>> value as the offset from the %fs base to the long containing the
>> unique thread id. But don't use this in anything except private
>> code.
>
> Ah, thanks, that's what I was interested in knowing.
>
>>> (I technically don't need the thread ID, I just need to get some
>>> form of unique identifier for the current thread such that I can
>>> compare it to a known global value that's been set to the main
>>> thread, in order to determine if I'm currently that thread or not.
>>> As long as it's unique for each thread, and static for the lifetime
>>> of the thread, that's fine.)
>>>
>>> The "am I the main thread?" comparison is made every ~50-100
>>> opcodes, which is why it needs to have the lowest overhead
>>> possible.
>>
>> On newer CPUs in amd64 mode, there is the rdfsbase instruction,
>> which reads the %fs register base. The system guarantees that the
>> %fs base is unique among live threads.
>
> Interesting. I was aware of those instructions, but never assessed
> them in detail once I'd figured out the readgsdword approach. I
> definitely didn't realize they return unique values per thread
> (although it makes sense now that I think about it).
>
> Thanks Konstantin, very helpful.
>
>     Trent.
Re: Getting the current thread ID without a syscall?
On 1/15/13 1:43 PM, Konstantin Belousov wrote:
> On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:
>> Luckily it's for an open source project (Python), so recompilation
>> isn't a big deal. (I also check the intrinsic result versus the
>> syscall result during startup to verify the same ID is returned,
>> falling back to the syscall by default.)
>
> For you, maybe. For your users, it definitely will be a problem. And
> worse, the problem will be blamed on the operating system and not on
> the broken application.

Anything we can do to avoid this would be best. The reason is that we
are still dealing with an optimization that perl did: it reached inside
the opaque struct FILE to do nasty things. Now it is very difficult for
us to fix struct FILE. We are still paying for this years later.

Any way we can make this a supported interface?

-Alfred
Re: Getting the current thread ID without a syscall?
On Tue, 2013-01-15 at 14:29 -0800, Alfred Perlstein wrote:
> On 1/15/13 1:43 PM, Konstantin Belousov wrote:
>> On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:
>>> Luckily it's for an open source project (Python), so recompilation
>>> isn't a big deal. (I also check the intrinsic result versus the
>>> syscall result during startup to verify the same ID is returned,
>>> falling back to the syscall by default.)
>>
>> For you, maybe. For your users, it definitely will be a problem. And
>> worse, the problem will be blamed on the operating system and not on
>> the broken application.
>
> Anything we can do to avoid this would be best. The reason is that we
> are still dealing with an optimization that perl did: it reached
> inside the opaque struct FILE to do nasty things. Now it is very
> difficult for us to fix struct FILE. We are still paying for this
> years later.
>
> Any way we can make this a supported interface?
>
> -Alfred

Re-reading the original question, I've got to ask why pthread_self()
isn't the right answer? The requirement wasn't "I need to know what the
OS calls me", it was "I need a unique ID per thread within a process."

-- Ian
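For reference, the supported baseline that the pthread_self() suggestion implies might look like the sketch below. The names (main_thread, remember_main_thread, is_main_thread, new_thread_is_main) are hypothetical and chosen only for illustration. It uses pthread_equal() rather than ==, because POSIX does not promise that pthread_t is a type that can be compared with an operator.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical global, recorded once at startup. */
static pthread_t main_thread;

/* Call once, from the main thread, before other threads exist. */
static void remember_main_thread(void)
{
    main_thread = pthread_self();
}

/* Portable "am I the main thread?" check: a function call plus a compare. */
static int is_main_thread(void)
{
    return pthread_equal(pthread_self(), main_thread) != 0;
}

/* Thread body: record the answer seen from a non-main thread. */
static void *record_answer(void *out)
{
    *(int *)out = is_main_thread();
    return NULL;
}

/* Spawns one thread and returns its is_main_thread() result. */
static int new_thread_is_main(void)
{
    pthread_t t;
    int answer = -1;
    int rc;

    rc = pthread_create(&t, NULL, record_answer, &answer);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;   /* keeps NDEBUG builds free of unused-variable warnings */
    return answer;
}
```

This costs a call into the threading library on every check, which is the overhead the original poster wants to avoid, but it relies on nothing private.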
Re: Getting the current thread ID without a syscall?
On Tue, Jan 15, 2013 at 02:33:41PM -0800, Ian Lepore wrote:
> On Tue, 2013-01-15 at 14:29 -0800, Alfred Perlstein wrote:
>> On 1/15/13 1:43 PM, Konstantin Belousov wrote:
>>> On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:
>>>> Luckily it's for an open source project (Python), so recompilation
>>>> isn't a big deal. (I also check the intrinsic result versus the
>>>> syscall result during startup to verify the same ID is returned,
>>>> falling back to the syscall by default.)
>>>
>>> For you, maybe. For your users, it definitely will be a problem.
>>> And worse, the problem will be blamed on the operating system and
>>> not on the broken application.
>>
>> Anything we can do to avoid this would be best. The reason is that
>> we are still dealing with an optimization that perl did: it reached
>> inside the opaque struct FILE to do nasty things. Now it is very
>> difficult for us to fix struct FILE. We are still paying for this
>> years later.
>>
>> Any way we can make this a supported interface?
>>
>> -Alfred
>
> Re-reading the original question, I've got to ask why pthread_self()
> isn't the right answer? The requirement wasn't "I need to know what
> the OS calls me", it was "I need a unique ID per thread within a
> process."

The identity check is performed hundreds of times per second. The
overhead of (Py_MainThreadId == __readgsdword(0x48) ? A() : B()) is
negligible -- I can't say the same for a system/function call.

(I'm experimenting with an idea I had to parallelize Python such that
it can exploit all cores without impeding the performance of normal
single-threaded execution (as previous GIL-removal attempts and STM
did). It's very promising so far -- presuming we can get the current
thread ID in a couple of instructions. If not, single-threaded
performance suffers too much.)

    Trent.
Re: Getting the current thread ID without a syscall?
On Tue, Jan 15, 2013 at 06:03:30PM -0500, Trent Nelson wrote:
> On Tue, Jan 15, 2013 at 02:33:41PM -0800, Ian Lepore wrote:
>> On Tue, 2013-01-15 at 14:29 -0800, Alfred Perlstein wrote:
>>> On 1/15/13 1:43 PM, Konstantin Belousov wrote:
>>>> On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:
>>>>> Luckily it's for an open source project (Python), so
>>>>> recompilation isn't a big deal. (I also check the intrinsic
>>>>> result versus the syscall result during startup to verify the
>>>>> same ID is returned, falling back to the syscall by default.)
>>>>
>>>> For you, maybe. For your users, it definitely will be a problem.
>>>> And worse, the problem will be blamed on the operating system and
>>>> not on the broken application.
>>>
>>> Anything we can do to avoid this would be best. The reason is that
>>> we are still dealing with an optimization that perl did: it reached
>>> inside the opaque struct FILE to do nasty things. Now it is very
>>> difficult for us to fix struct FILE. We are still paying for this
>>> years later.
>>>
>>> Any way we can make this a supported interface?
>>>
>>> -Alfred
>>
>> Re-reading the original question, I've got to ask why pthread_self()
>> isn't the right answer? The requirement wasn't "I need to know what
>> the OS calls me", it was "I need a unique ID per thread within a
>> process."
>
> The identity check is performed hundreds of times per second. The
> overhead of (Py_MainThreadId == __readgsdword(0x48) ? A() : B()) is
> negligible -- I can't say the same for a system/function call.
>
> (I'm experimenting with an idea I had to parallelize Python such that
> it can exploit all cores without impeding the performance of normal
> single-threaded execution (as previous GIL-removal attempts and STM
> did). It's very promising so far -- presuming we can get the current
> thread ID in a couple of instructions. If not, single-threaded
> performance suffers too much.)

If the only thing you ever need is to get a unique handle for the
current thread, without the requirement that it corresponds to any
other identifier, everything becomes much easier.

On amd64, use 'movq %fs:0,%register'; on i386, use
'movl %gs:0,%register'. This instruction is guaranteed to return the
thread-unique address of the tcb. See an article about ELF TLS for more
details. Even better, this instruction is portable among all ELF Unixes
which support TLS.
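Wrapped in GCC/Clang extended inline assembly, the instruction above might be used as in the following sketch. The function names (thread_handle, handle_of_new_thread) are hypothetical, and the #else branch is an assumption added here so the example also builds on other targets: it substitutes the address of a thread-local variable, which is likewise distinct for each live thread but is not what the message describes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Returns an opaque handle that differs between live threads. */
static inline uintptr_t thread_handle(void)
{
#if defined(__ELF__) && defined(__x86_64__)
    uintptr_t h;
    __asm__ volatile ("movq %%fs:0, %0" : "=r" (h));   /* tcb address, amd64 */
    return h;
#elif defined(__ELF__) && defined(__i386__)
    uintptr_t h;
    __asm__ volatile ("movl %%gs:0, %0" : "=r" (h));   /* tcb address, i386 */
    return h;
#else
    /* Assumed fallback, not from the thread: address of a TLS object. */
    static _Thread_local char anchor;
    return (uintptr_t)&anchor;
#endif
}

/* Thread body: store the new thread's handle for the caller. */
static void *record_handle(void *out)
{
    *(uintptr_t *)out = thread_handle();
    return NULL;
}

/* Spawns one thread and returns the handle it observed while running. */
static uintptr_t handle_of_new_thread(void)
{
    pthread_t t;
    uintptr_t h = 0;
    int rc;

    rc = pthread_create(&t, NULL, record_handle, &h);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;   /* keeps NDEBUG builds free of unused-variable warnings */
    return h;
}
```

The handle is only guaranteed unique among threads that are alive at the same time, so a value saved from a thread that has since exited may be reused. For a main thread that lives for the whole process, comparing thread_handle() against a value saved at startup avoids that problem.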