Re: Getting the current thread ID without a syscall?

2013-01-18 Thread Julian Elischer

On 1/15/13 4:03 PM, Trent Nelson wrote:

On Tue, Jan 15, 2013 at 02:33:41PM -0800, Ian Lepore wrote:

On Tue, 2013-01-15 at 14:29 -0800, Alfred Perlstein wrote:

On 1/15/13 1:43 PM, Konstantin Belousov wrote:

On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:

  Luckily it's for an open source project (Python), so recompilation
  isn't a big deal.  (I also check the intrinsic result versus the
  syscall result during startup to verify the same ID is returned,
  falling back to the syscall by default.)

For you, maybe. For your users, it definitely will be a problem.
And worse, the problem will be blamed on the operating system and not
on the broken application.


Anything we can do to avoid this would be best.

The reason is that we are still dealing with an optimization that perl
did: it reached inside of the opaque struct FILE to do nasty things.
Now it is very difficult for us to fix struct FILE.

We are still paying for this years later.

Any way we can make this a supported interface?

-Alfred

Re-reading the original question, I've got to ask why pthread_self()
isn't the right answer?  The requirement wasn't "I need to know what the
OS calls me", it was "I need a unique ID per thread within a process."

 The identity check is performed hundreds of times per second.  The
 overhead of (Py_MainThreadId == __readgsdword(0x48) ? A() : B()) is
 negligible -- I can't say the same for a system/function call.

 (I'm experimenting with an idea I had to parallelize Python such
  that it can exploit all cores without impeding the performance
  of normal single-threaded execution (like previous-GIL-removal
  attempts and STM).  It's very promising so far -- presuming we
  can get the current thread ID in a couple of instructions.  If
  not, single-threaded performance suffers too much.)


TLS?
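
A minimal sketch of what the one-word TLS suggestion could look like in C.
It is an illustration only, not code from this thread: the names
on_main_thread, mark_main_thread(), is_main_thread() and demo_tls_flag()
are hypothetical, and it assumes a compiler that supports the __thread
storage class (GCC and Clang do).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Assumption: the compiler supports __thread.
 * Every thread gets its own zero-initialised copy of this flag.
 */
static __thread int on_main_thread;

/* Call once, from the main thread, before any other thread is started. */
void
mark_main_thread(void)
{
	on_main_thread = 1;
}

/* A plain read of the calling thread's copy of the flag. */
int
is_main_thread(void)
{
	return (on_main_thread);
}

static void *
worker(void *arg)
{
	/* Record what the check reports when run off the main thread. */
	*(int *)arg = is_main_thread();
	return (NULL);
}

/* Returns 1 when main is seen as main and the worker is not. */
int
demo_tls_flag(void)
{
	pthread_t t;
	int seen_in_worker = -1;
	int rc;

	mark_main_thread();
	rc = pthread_create(&t, NULL, worker, &seen_in_worker);
	assert(rc == 0);
	rc = pthread_join(t, NULL);
	assert(rc == 0);
	return (is_main_thread() && seen_in_worker == 0);
}
```

In an executable the read typically compiles down to a single
segment-relative load; in a shared object the compiler may have to go
through the dynamic TLS lookup instead, so the cost should be measured
rather than assumed.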



 Trent.
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org





Getting the current thread ID without a syscall?

2013-01-15 Thread Trent Nelson
Howdy,

I have an unusual requirement: I need to get the current thread ID
in as few instructions as possible.  On Windows, I managed to come
up with this glorious hack:

#ifdef WITH_INTRINSICS
#   ifdef MS_WINDOWS
#   include <intrin.h>
#   if defined(MS_WIN64)
#   pragma intrinsic(__readgsdword)
#   define _Py_get_current_process_id() (__readgsdword(0x40))
#   define _Py_get_current_thread_id()  (__readgsdword(0x48))
#   elif defined(MS_WIN32)
#   pragma intrinsic(__readfsdword)
#   define _Py_get_current_process_id() (__readfsdword(0x20))
#   define _Py_get_current_thread_id()  (__readfsdword(0x24))

That exploits the fact that Windows uses the FS/GS registers to
store thread/process metadata.  Could I use a similar approach on
FreeBSD to get the thread ID without the need for syscalls?

(I technically don't need the thread ID, I just need to get some
 form of unique identifier for the current thread such that I can
 compare it to a known global value that's been set to the main
 thread, in order to determine if I'm currently that thread or not.
 As long as it's unique for each thread, and static for the lifetime
 of the thread, that's fine.)

The "am I the main thread?" comparison is made every ~50-100 opcodes,
which is why it needs to have the lowest overhead possible.

Regards,

Trent.


Re: Getting the current thread ID without a syscall?

2013-01-15 Thread Konstantin Belousov
On Tue, Jan 15, 2013 at 03:54:03PM -0500, Trent Nelson wrote:
 Howdy,
 
 I have an unusual requirement: I need to get the current thread ID
 in as few instructions as possible.  On Windows, I managed to come
 up with this glorious hack:
 
 #ifdef WITH_INTRINSICS
 #   ifdef MS_WINDOWS
 #   include <intrin.h>
 #   if defined(MS_WIN64)
 #   pragma intrinsic(__readgsdword)
 #   define _Py_get_current_process_id() (__readgsdword(0x40))
 #   define _Py_get_current_thread_id()  (__readgsdword(0x48))
 #   elif defined(MS_WIN32)
 #   pragma intrinsic(__readfsdword)
 #   define _Py_get_current_process_id() (__readfsdword(0x20))
 #   define _Py_get_current_thread_id()  (__readfsdword(0x24))
 
 That exploits the fact that Windows uses the FS/GS registers to
 store thread/process metadata.  Could I use a similar approach on
 FreeBSD to get the thread ID without the need for syscalls?
The layout of the per-thread structure used by libthr is private and
is not guaranteed to be stable even on the stable branches.

Yes, you could obtain the tid this way, but note explicitly that using
it makes your application not binary compatible with any version of
FreeBSD except the one you compiled on.

You could read the _thread_off_tid integer variable and use the value
as offset from the %fs base to the long containing the unique thread id.
But don't use this in anything except the private code.

 
 (I technically don't need the thread ID, I just need to get some
  form of unique identifier for the current thread such that I can
  compare it to a known global value that's been set to the main
  thread, in order to determine if I'm currently that thread or not.
  As long as it's unique for each thread, and static for the lifetime
  of the thread, that's fine.)
 
 The "am I the main thread?" comparison is made every ~50-100 opcodes,
 which is why it needs to have the lowest overhead possible.
On newer CPUs in amd64 mode, there is the rdfsbase instruction, which
reads the %fs register base. The system guarantees that the %fs base is
unique among live threads.




Re: Getting the current thread ID without a syscall?

2013-01-15 Thread Trent Nelson
On Tue, Jan 15, 2013 at 01:16:41PM -0800, Konstantin Belousov wrote:
 On Tue, Jan 15, 2013 at 03:54:03PM -0500, Trent Nelson wrote:
  Howdy,
  
  I have an unusual requirement: I need to get the current thread ID
  in as few instructions as possible.  On Windows, I managed to come
  up with this glorious hack:
  
  #ifdef WITH_INTRINSICS
  #   ifdef MS_WINDOWS
  #   include <intrin.h>
  #   if defined(MS_WIN64)
  #   pragma intrinsic(__readgsdword)
  #   define _Py_get_current_process_id() (__readgsdword(0x40))
  #   define _Py_get_current_thread_id()  (__readgsdword(0x48))
  #   elif defined(MS_WIN32)
  #   pragma intrinsic(__readfsdword)
  #   define _Py_get_current_process_id() (__readfsdword(0x20))
  #   define _Py_get_current_thread_id()  (__readfsdword(0x24))
  
  That exploits the fact that Windows uses the FS/GS registers to
  store thread/process metadata.  Could I use a similar approach on
  FreeBSD to get the thread ID without the need for syscalls?
 The layout of the per-thread structure used by libthr is private and
 is not guaranteed to be stable even on the stable branches.
 
 Yes, you could obtain the tid this way, but note explicitly that using
 it makes your application not binary compatible with any version of
 FreeBSD except the one you compiled on.

Luckily it's for an open source project (Python), so recompilation
isn't a big deal.  (I also check the intrinsic result versus the
syscall result during startup to verify the same ID is returned,
falling back to the syscall by default.)
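
A rough sketch of the verify-at-startup idea described above, under
loudly stated assumptions. It is not the code from the Python patch:
pthread_self() stands in for the thread-ID syscall, a read of %fs:0
stands in for the Windows intrinsic, and every name (use_fast_path,
slow_thread_id(), fast_thread_id(), verify_fast_thread_id(),
checked_thread_id()) is hypothetical. Whether the two values agree is
platform dependent; when they do not, the slow path simply stays
selected.

```c
#include <pthread.h>
#include <stdint.h>

/* 0 = use the slow, supported call (the default until verified). */
static int use_fast_path;

/* Reference value. Assumption: pthread_t converts cleanly to uintptr_t. */
static uintptr_t
slow_thread_id(void)
{
	return ((uintptr_t)pthread_self());
}

/* Candidate shortcut; only attempted where %fs is known to hold the TCB. */
static uintptr_t
fast_thread_id(void)
{
#if defined(__x86_64__) && (defined(__FreeBSD__) || defined(__linux__))
	uintptr_t v;

	__asm__ ("movq %%fs:0, %0" : "=r" (v));
	return (v);
#else
	return (slow_thread_id());
#endif
}

/* Run once at startup: enable the shortcut only if both paths agree. */
void
verify_fast_thread_id(void)
{
	use_fast_path = (fast_thread_id() == slow_thread_id());
}

/* The value callers use; falls back to the slow path when unverified. */
uintptr_t
checked_thread_id(void)
{
	return (use_fast_path ? fast_thread_id() : slow_thread_id());
}
```

The check only proves agreement on the thread and the system it ran on,
which is why it reduces, but does not remove, the binary compatibility
concern raised in this thread.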

 You could read the _thread_off_tid integer variable and use the value
 as offset from the %fs base to the long containing the unique thread id.
 But don't use this in anything except the private code.

Ah, thanks, that's what I was interested in knowing.

  
  (I technically don't need the thread ID, I just need to get some
   form of unique identifier for the current thread such that I can
   compare it to a known global value that's been set to the main
   thread, in order to determine if I'm currently that thread or not.
   As long as it's unique for each thread, and static for the lifetime
   of the thread, that's fine.)
  
  The "am I the main thread?" comparison is made every ~50-100 opcodes,
  which is why it needs to have the lowest overhead possible.

 On newer CPUs in amd64 mode, there is the rdfsbase instruction, which
 reads the %fs register base. The system guarantees that the %fs base is
 unique among live threads.

Interesting.  I was aware of those instructions, but never assessed
them in detail once I'd figured out the readgsdword approach.  I
definitely didn't realize they return unique values per thread
(although it makes sense now that I think about it).

Thanks Konstantin, very helpful.

Trent.


Re: Getting the current thread ID without a syscall?

2013-01-15 Thread Konstantin Belousov
On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:
 On Tue, Jan 15, 2013 at 01:16:41PM -0800, Konstantin Belousov wrote:
  On Tue, Jan 15, 2013 at 03:54:03PM -0500, Trent Nelson wrote:
   Howdy,
   
   I have an unusual requirement: I need to get the current thread ID
   in as few instructions as possible.  On Windows, I managed to come
   up with this glorious hack:
   
   #ifdef WITH_INTRINSICS
   #   ifdef MS_WINDOWS
   #   include <intrin.h>
   #   if defined(MS_WIN64)
   #   pragma intrinsic(__readgsdword)
   #   define _Py_get_current_process_id() (__readgsdword(0x40))
   #   define _Py_get_current_thread_id()  (__readgsdword(0x48))
   #   elif defined(MS_WIN32)
   #   pragma intrinsic(__readfsdword)
   #   define _Py_get_current_process_id() (__readfsdword(0x20))
   #   define _Py_get_current_thread_id()  (__readfsdword(0x24))
   
   That exploits the fact that Windows uses the FS/GS registers to
   store thread/process metadata.  Could I use a similar approach on
   FreeBSD to get the thread ID without the need for syscalls?
  The layout of the per-thread structure used by libthr is private and
  is not guaranteed to be stable even on the stable branches.
  
  Yes, you could obtain the tid this way, but note explicitly that using
  it makes your application not binary compatible with any version of
  FreeBSD except the one you compiled on.
 
 Luckily it's for an open source project (Python), so recompilation
 isn't a big deal.  (I also check the intrinsic result versus the
 syscall result during startup to verify the same ID is returned,
 falling back to the syscall by default.)
For you, maybe. For your users, it definitely will be a problem.
And worse, the problem will be blamed on the operating system and not
on the broken application.

 
  You could read the _thread_off_tid integer variable and use the value
  as offset from the %fs base to the long containing the unique thread id.
  But don't use this in anything except the private code.
 
 Ah, thanks, that's what I was interested in knowing.
 
   
   (I technically don't need the thread ID, I just need to get some
    form of unique identifier for the current thread such that I can
    compare it to a known global value that's been set to the main
    thread, in order to determine if I'm currently that thread or not.
    As long as it's unique for each thread, and static for the lifetime
    of the thread, that's fine.)
   
   The "am I the main thread?" comparison is made every ~50-100 opcodes,
   which is why it needs to have the lowest overhead possible.
 
  On newer CPUs in amd64 mode, there is the rdfsbase instruction, which
  reads the %fs register base. The system guarantees that the %fs base is
  unique among live threads.
 
 Interesting.  I was aware of those instructions, but never assessed
 them in detail once I'd figured out the readgsdword approach.  I
 definitely didn't realize they return unique values per thread
 (although it makes sense now that I think about it).
 
 Thanks Konstantin, very helpful.
 
 Trent.




Re: Getting the current thread ID without a syscall?

2013-01-15 Thread Alfred Perlstein

On 1/15/13 1:43 PM, Konstantin Belousov wrote:

On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:


 Luckily it's for an open source project (Python), so recompilation
 isn't a big deal.  (I also check the intrinsic result versus the
 syscall result during startup to verify the same ID is returned,
 falling back to the syscall by default.)

For you, maybe. For your users, it definitely will be a problem.
And worse, the problem will be blamed on the operating system and not
on the broken application.


Anything we can do to avoid this would be best.

The reason is that we are still dealing with an optimization that perl
did: it reached inside of the opaque struct FILE to do nasty things.
Now it is very difficult for us to fix struct FILE.


We are still paying for this years later.

Any way we can make this a supported interface?

-Alfred




Re: Getting the current thread ID without a syscall?

2013-01-15 Thread Ian Lepore
On Tue, 2013-01-15 at 14:29 -0800, Alfred Perlstein wrote:
 On 1/15/13 1:43 PM, Konstantin Belousov wrote:
  On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:
 
   Luckily it's for an open source project (Python), so recompilation
   isn't a big deal.  (I also check the intrinsic result versus the
   syscall result during startup to verify the same ID is returned,
   falling back to the syscall by default.)
  For you, maybe. For your users, it definitely will be a problem.
  And worse, the problem will be blamed on the operating system and not
  on the broken application.
 
 Anything we can do to avoid this would be best.
 
 The reason is that we are still dealing with an optimization that perl
 did: it reached inside of the opaque struct FILE to do nasty things.
 Now it is very difficult for us to fix struct FILE.
 
 We are still paying for this years later.
 
 Any way we can make this a supported interface?
 
 -Alfred

Re-reading the original question, I've got to ask why pthread_self()
isn't the right answer?  The requirement wasn't "I need to know what the
OS calls me", it was "I need a unique ID per thread within a process."

-- Ian
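
For illustration, a minimal sketch of the pthread_self() approach
suggested above. The helper names (main_thread, remember_main_thread(),
is_main_thread(), demo_pthread_self()) are hypothetical and not from any
message in this thread; only pthread_self(), pthread_equal(),
pthread_create() and pthread_join() are real POSIX calls.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Handle of the main thread, captured once at startup. */
static pthread_t main_thread;

/* Call once, from the main thread, before any other thread is started. */
void
remember_main_thread(void)
{
	main_thread = pthread_self();
}

int
is_main_thread(void)
{
	/* pthread_equal() is the portable way to compare pthread_t values. */
	return (pthread_equal(pthread_self(), main_thread) != 0);
}

static void *
worker(void *arg)
{
	/* Record what the check reports when run off the main thread. */
	*(int *)arg = is_main_thread();
	return (NULL);
}

/* Returns 1 when main is seen as main and the worker is not. */
int
demo_pthread_self(void)
{
	pthread_t t;
	int seen_in_worker = -1;
	int rc;

	remember_main_thread();
	rc = pthread_create(&t, NULL, worker, &seen_in_worker);
	assert(rc == 0);
	rc = pthread_join(t, NULL);
	assert(rc == 0);
	return (is_main_thread() && seen_in_worker == 0);
}
```

This stays entirely within the supported interface, at the price of a
function call per check.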




Re: Getting the current thread ID without a syscall?

2013-01-15 Thread Trent Nelson
On Tue, Jan 15, 2013 at 02:33:41PM -0800, Ian Lepore wrote:
 On Tue, 2013-01-15 at 14:29 -0800, Alfred Perlstein wrote:
  On 1/15/13 1:43 PM, Konstantin Belousov wrote:
   On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:
  
    Luckily it's for an open source project (Python), so recompilation
    isn't a big deal.  (I also check the intrinsic result versus the
    syscall result during startup to verify the same ID is returned,
    falling back to the syscall by default.)
   For you, maybe. For your users, it definitely will be a problem.
   And worse, the problem will be blamed on the operating system and not
   on the broken application.
  
  Anything we can do to avoid this would be best.
  
  The reason is that we are still dealing with an optimization that perl
  did: it reached inside of the opaque struct FILE to do nasty things.
  Now it is very difficult for us to fix struct FILE.
  
  We are still paying for this years later.
  
  Any way we can make this a supported interface?
  
  -Alfred
 
 Re-reading the original question, I've got to ask why pthread_self()
 isn't the right answer?  The requirement wasn't "I need to know what the
 OS calls me", it was "I need a unique ID per thread within a process."

The identity check is performed hundreds of times per second.  The
overhead of (Py_MainThreadId == __readgsdword(0x48) ? A() : B()) is
negligible -- I can't say the same for a system/function call.

(I'm experimenting with an idea I had to parallelize Python such
 that it can exploit all cores without impeding the performance
 of normal single-threaded execution (like previous-GIL-removal
 attempts and STM).  It's very promising so far -- presuming we
 can get the current thread ID in a couple of instructions.  If
 not, single-threaded performance suffers too much.)

Trent.


Re: Getting the current thread ID without a syscall?

2013-01-15 Thread Konstantin Belousov
On Tue, Jan 15, 2013 at 06:03:30PM -0500, Trent Nelson wrote:
 On Tue, Jan 15, 2013 at 02:33:41PM -0800, Ian Lepore wrote:
  On Tue, 2013-01-15 at 14:29 -0800, Alfred Perlstein wrote:
   On 1/15/13 1:43 PM, Konstantin Belousov wrote:
    On Tue, Jan 15, 2013 at 04:35:14PM -0500, Trent Nelson wrote:
   
     Luckily it's for an open source project (Python), so recompilation
     isn't a big deal.  (I also check the intrinsic result versus the
     syscall result during startup to verify the same ID is returned,
     falling back to the syscall by default.)
    For you, maybe. For your users, it definitely will be a problem.
    And worse, the problem will be blamed on the operating system and not
    on the broken application.
   
   Anything we can do to avoid this would be best.
   
   The reason is that we are still dealing with an optimization that perl
   did: it reached inside of the opaque struct FILE to do nasty things.
   Now it is very difficult for us to fix struct FILE.
   
   We are still paying for this years later.
   
   Any way we can make this a supported interface?
   
   -Alfred
  
  Re-reading the original question, I've got to ask why pthread_self()
  isn't the right answer?  The requirement wasn't "I need to know what the
  OS calls me", it was "I need a unique ID per thread within a process."
 
 The identity check is performed hundreds of times per second.  The
 overhead of (Py_MainThreadId == __readgsdword(0x48) ? A() : B()) is
 negligible -- I can't say the same for a system/function call.
 
 (I'm experimenting with an idea I had to parallelize Python such
  that it can exploit all cores without impeding the performance
  of normal single-threaded execution (like previous-GIL-removal
  attempts and STM).  It's very promising so far -- presuming we
  can get the current thread ID in a couple of instructions.  If
  not, single-threaded performance suffers too much.)

If the only thing you ever need is to get a unique handle for the
current thread, without the requirement that it corresponds to any
other identifier, everything becomes much easier.

On amd64, use 'movq %fs:0,%register', on i386 'movl %gs:0,%register'.
This instruction is guaranteed to return the thread-unique address
of the tcb. See an article about ELF TLS for more details.

Even better, this instruction is portable among all ELF Unixes which
support TLS.
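
A hedged sketch of how the instruction above might be wrapped in C with
GCC/Clang inline assembly. The function and variable names
(current_thread_handle(), main_thread_handle, remember_main_thread(),
is_main_thread(), demo_tcb_handle()) are hypothetical, and the non-x86
fallback (taking the address of a __thread variable) is an addition of
this sketch rather than something proposed in the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Return a value that is unique among live threads of this process. */
static inline uintptr_t
current_thread_handle(void)
{
#if defined(__x86_64__) && (defined(__FreeBSD__) || defined(__linux__))
	uintptr_t tcb;

	/* Word 0 of the TCB holds the TCB's own address. */
	__asm__ ("movq %%fs:0, %0" : "=r" (tcb));
	return (tcb);
#elif defined(__i386__) && (defined(__FreeBSD__) || defined(__linux__))
	uintptr_t tcb;

	__asm__ ("movl %%gs:0, %0" : "=r" (tcb));
	return (tcb);
#else
	/* Assumption: __thread is available; its address differs per thread. */
	static __thread char marker;

	return ((uintptr_t)&marker);
#endif
}

static uintptr_t main_thread_handle;

/* Call once, from the main thread, before any other thread is started. */
void
remember_main_thread(void)
{
	main_thread_handle = current_thread_handle();
}

int
is_main_thread(void)
{
	return (current_thread_handle() == main_thread_handle);
}

static void *
worker(void *arg)
{
	/* Record what the check reports when run off the main thread. */
	*(int *)arg = is_main_thread();
	return (NULL);
}

/* Returns 1 when main is seen as main and the worker is not. */
int
demo_tcb_handle(void)
{
	pthread_t t;
	int seen_in_worker = -1;
	int rc;

	remember_main_thread();
	rc = pthread_create(&t, NULL, worker, &seen_in_worker);
	assert(rc == 0);
	rc = pthread_join(t, NULL);
	assert(rc == 0);
	return (is_main_thread() && seen_in_worker == 0);
}
```

The handle is only meaningful for equality comparison while the thread
is alive; it is not the kernel thread ID and may be reused after the
thread exits.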

