Why is TUNABLE_INT discouraged?

2010-08-06 Thread Garrett Cooper
Hi Hackers,
Poking around the sound(4) drivers, I looked into converting one
of the TUNABLE_INTs to an unsigned tunable for testing purposes. I
looked in kernel.h and I saw the following comment:

/*
 * int
 * please avoid using for new tunables!
 */

   I found the commit where it was made (by des@ -- cvs revision
1.120), but unfortunately I lack the context as to why that suggestion
is made; the commit isn't very explicit as to why integers tunables
should be discouraged -- and in the case of
hw.sound.feeder_rate_round, it makes just as much sense to use an
integer type or a long integer type, as accepted input values are
small enough to fit in an integer value with a _lot_ of headroom:

 hw.snd.feeder_rate_round
 Sample rate rounding threshold, to avoid large prime division at
 the cost of accuracy.  All requested sample rates will be rounded
 to the nearest threshold value.  Possible values range between 0
 (disabled) and 500.  Default is 25.

It would be nice to know why TUNABLE_INT is discouraged :).
Thanks!
-Garrett
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: sched_pin() versus PCPU_GET

2010-08-06 Thread Attilio Rao
2010/8/5 John Baldwin :
> On Thursday, August 05, 2010 11:59:37 am m...@freebsd.org wrote:
>> On Wed, Aug 4, 2010 at 11:55 AM, John Baldwin  wrote:
>> > On Wednesday, August 04, 2010 12:20:31 pm m...@freebsd.org wrote:
>> >> On Wed, Aug 4, 2010 at 2:26 PM, John Baldwin  wrote:
>> >> > On Tuesday, August 03, 2010 9:46:16 pm m...@freebsd.org wrote:
>> >> >> On Fri, Jul 30, 2010 at 2:31 PM, John Baldwin  wrote:
>> >> >> > On Friday, July 30, 2010 10:08:22 am John Baldwin wrote:
>> >> >> >> On Thursday, July 29, 2010 7:39:02 pm m...@freebsd.org wrote:
>> >> >> >> > We've seen a few instances at work where witness_warn() in ast()
>> >> >> >> > indicates the sched lock is still held, but the place it claims
> it was
>> >> >> >> > held by is in fact sometimes not possible to keep the lock, like:
>> >> >> >> >
>> >> >> >> >     thread_lock(td);
>> >> >> >> >     td->td_flags &= ~TDF_SELECT;
>> >> >> >> >     thread_unlock(td);
>> >> >> >> >
>> >> >> >> > What I was wondering is, even though the assembly I see in
> objdump -S
>> >> >> >> > for witness_warn has the increment of td_pinned before the
> PCPU_GET:
>> >> >> >> >
>> >> >> >> > 802db210:   65 48 8b 1c 25 00 00    mov    %gs:0x0,%rbx
>> >> >> >> > 802db217:   00 00
>> >> >> >> > 802db219:   ff 83 04 01 00 00       incl   0x104(%rbx)
>> >> >> >> >      * Pin the thread in order to avoid problems with thread
> migration.
>> >> >> >> >      * Once that all verifies are passed about spinlocks
> ownership,
>> >> >> >> >      * the thread is in a safe path and it can be unpinned.
>> >> >> >> >      */
>> >> >> >> >     sched_pin();
>> >> >> >> >     lock_list = PCPU_GET(spinlocks);
>> >> >> >> > 802db21f:   65 48 8b 04 25 48 00    mov    %gs:0x48,%rax
>> >> >> >> > 802db226:   00 00
>> >> >> >> >     if (lock_list != NULL && lock_list->ll_count != 0) {
>> >> >> >> > 802db228:   48 85 c0                test   %rax,%rax
>> >> >> >> >      * Pin the thread in order to avoid problems with thread
> migration.
>> >> >> >> >      * Once that all verifies are passed about spinlocks
> ownership,
>> >> >> >> >      * the thread is in a safe path and it can be unpinned.
>> >> >> >> >      */
>> >> >> >> >     sched_pin();
>> >> >> >> >     lock_list = PCPU_GET(spinlocks);
>> >> >> >> > 802db22b:   48 89 85 f0 fe ff ff    mov
>  %rax,-0x110(%rbp)
>> >> >> >> > 802db232:   48 89 85 f8 fe ff ff    mov
>  %rax,-0x108(%rbp)
>> >> >> >> >     if (lock_list != NULL && lock_list->ll_count != 0) {
>> >> >> >> > 802db239:   0f 84 ff 00 00 00       je
> 802db33e
>> >> >> >> > 
>> >> >> >> > 802db23f:   44 8b 60 50             mov    0x50(%rax),
> %r12d
>> >> >> >> >
>> >> >> >> > is it possible for the hardware to do any re-ordering here?
>> >> >> >> >
>> >> >> >> > The reason I'm suspicious is not just that the code doesn't have
> a
>> >> >> >> > lock leak at the indicated point, but in one instance I can see
> in the
>> >> >> >> > dump that the lock_list local from witness_warn is from the pcpu
>> >> >> >> > structure for CPU 0 (and I was warned about sched lock 0), but
> the
>> >> >> >> > thread id in panic_cpu is 2.  So clearly the thread was being
> migrated
>> >> >> >> > right around panic time.
>> >> >> >> >
>> >> >> >> > This is the amd64 kernel on stable/7.  I'm not sure exactly what
> kind
>> >> >> >> > of hardware; it's a 4-way Intel chip from about 3 or 4 years ago
> IIRC.
>> >> >> >> >
>> >> >> >> > So... do we need some kind of barrier in the code for sched_pin()
> for
>> >> >> >> > it to really do what it claims?  Could the hardware have re-
> ordered
>> >> >> >> > the "mov    %gs:0x48,%rax" PCPU_GET to before the sched_pin()
>> >> >> >> > increment?
>> >> >> >>
>> >> >> >> Hmmm, I think it might be able to because they refer to different
> locations.
>> >> >> >>
>> >> >> >> Note this rule in section 8.2.2 of Volume 3A:
>> >> >> >>
>> >> >> >>   • Reads may be reordered with older writes to different locations
> but not
>> >> >> >>     with older writes to the same location.
>> >> >> >>
>> >> >> >> It is certainly true that sparc64 could reorder with RMO.  I
> believe ia64
>> >> >> >> could reorder as well.  Since sched_pin/unpin are frequently used
> to provide
>> >> >> >> this sort of synchronization, we could use memory barriers in
> pin/unpin
>> >> >> >> like so:
>> >> >> >>
>> >> >> >> sched_pin()
>> >> >> >> {
>> >> >> >>       td->td_pinned = atomic_load_acq_int(&td->td_pinned) + 1;
>> >> >> >> }
>> >> >> >>
>> >> >> >> sched_unpin()
>> >> >> >> {
>> >> >> >>       atomic_store_rel_int(&td->td_pinned, td->td_pinned - 1);
>> >> >> >> }
>> >> >> >>
>> >> >> >> We could also just use atomic_add_acq_int() and
> atomic_sub_rel_int(), but they
>> >> >> >> are slightly more heavyweight, though it would be more clear what
> is happening
>> >> >> >> I think.
>> >> >> >
>> >> >> > However, to actually get a race you'd have to have an interrupt fire
> and
>> >> >> > migrate you so that t

Re: Avoiding sysctl at program startup using ELF aux vector (was: concurrent sysctl implementation)

2010-08-06 Thread Marius Strobl
On Fri, Aug 06, 2010 at 12:04:04PM +0300, Kostik Belousov wrote:
> On Fri, Aug 06, 2010 at 07:06:33AM +0200, Jeremie Le Hen wrote:
> > Hi Kib,
> > 
> > In-Reply-To: <20100629083901.gg13...@deviant.kiev.zoral.com.ua>
> > On Tue, Jun 29, 2010 at 11:39:01AM +0300, Kostik Belousov wrote:
> > > On Tue, Jun 29, 2010 at 10:26:39AM +0200, Marius Strobl wrote:
> > > > On Mon, Jun 28, 2010 at 05:48:59PM +0300, Kostik Belousov wrote:
> > > > > On Wed, Jun 23, 2010 at 11:09:59PM +0200, Jeremie Le Hen wrote:
> > > > > > Hi Kostik,
> > > > > > 
> > > > > > This patch seems to have faded out from memory.  Is it possible to 
> > > > > > go
> > > > > > forward and commit it?
> > > > > I refreshed the patch. Hopefully, nobody will object, and I commit it
> > > > > shortly.
> > > > > 
> > > > > > 
> > > > > > Thanks,
> > > > > > Regards.
> > > > > > 
> > > > > > On Sat, Jul 25, 2009 at 12:29:16AM +0300, Kostik Belousov wrote:
> > > > > > > Below is the prototype that seems to work for me both with 
> > > > > > > patched and
> > > > > > > old rtld on i386. Patch also contains bits for amd64 that I did 
> > > > > > > not
> > > > > > > tested yet. All other arches are not buildable for now.
> > > > > > > 
> > > > > > > Patch completely eliminates sysctl syscalls from the rtld and libc
> > > > > > > startup. Without the patch, a single run of /bin/ls did 6 sysctls,
> > > > > > > with the patch, no sysctls is queried at all.
> > > > > > > 
> > > > > Comparing with the originally posted patch, I added support for all
> > > > > architectures, tested amd64 and ia32 on amd64, and converted 
> > > > > getpagesizes(3)
> > > > > that added two more startup sysctls.
> > > > > 
> > > > > Would be nice to get a testing for at least some !x86 architectures
> > > > > before the commit, I added some people who helped me in past, to the 
> > > > > Cc:.
> > > > > 
> > > > 
> > > > Doesn't look good on sparc64:
> > > > <...>
> > > > NFS ROOT: 192.168.1.40:/usr/data/nfsroot/sparc64
> > > > dc1: link state changed to UP
> > > > pid 24 (ifconfig), uid 0: exited on signal 11
> > > > Segmentation fault
> > > > Interface  IP-Address  Broadcast
> > > > pid 29 (rcorder), uid 0: exited on signal 11
> > > > Segmentation fault
> > > > pid 30 (grep), uid 0: exited on signal 11
> > > > Segmentation fault
> > > > pid 31 (rcorder), uid 0: exited on signal 11
> > > > Segmentation fault
> > > >  
> > > > pid 32 (date), uid 0: exited on signal 11
> > > > Segmentation fault
> > > > Jun 29 12:20:50 getty[36]: open /dev/ttyv3: No such file or directory
> > > > <...>
> > > > 
> > > > Unfortunately, I currently lack the time to debug this.
> > > 
> > > Thank you.
> > 
> > Did yu have time to look at this problem?  It would be nice to have this
> > in the tree.
> 
> I cannot move forward without the help from somebody having access to
> sparc64 system where the problem is reproducable.

Do you have a debug version of the patch which outputs the necessary
information?

Marius

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Avoiding sysctl at program startup using ELF aux vector (was: concurrent sysctl implementation)

2010-08-06 Thread Kostik Belousov
On Fri, Aug 06, 2010 at 01:08:08PM +0200, Marius Strobl wrote:
> On Fri, Aug 06, 2010 at 12:04:04PM +0300, Kostik Belousov wrote:
> > On Fri, Aug 06, 2010 at 07:06:33AM +0200, Jeremie Le Hen wrote:
> > > Hi Kib,
> > > 
> > > In-Reply-To: <20100629083901.gg13...@deviant.kiev.zoral.com.ua>
> > > On Tue, Jun 29, 2010 at 11:39:01AM +0300, Kostik Belousov wrote:
> > > > On Tue, Jun 29, 2010 at 10:26:39AM +0200, Marius Strobl wrote:
> > > > > On Mon, Jun 28, 2010 at 05:48:59PM +0300, Kostik Belousov wrote:
> > > > > > On Wed, Jun 23, 2010 at 11:09:59PM +0200, Jeremie Le Hen wrote:
> > > > > > > Hi Kostik,
> > > > > > > 
> > > > > > > This patch seems to have faded out from memory.  Is it possible 
> > > > > > > to go
> > > > > > > forward and commit it?
> > > > > > I refreshed the patch. Hopefully, nobody will object, and I commit 
> > > > > > it
> > > > > > shortly.
> > > > > > 
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Regards.
> > > > > > > 
> > > > > > > On Sat, Jul 25, 2009 at 12:29:16AM +0300, Kostik Belousov wrote:
> > > > > > > > Below is the prototype that seems to work for me both with 
> > > > > > > > patched and
> > > > > > > > old rtld on i386. Patch also contains bits for amd64 that I did 
> > > > > > > > not
> > > > > > > > tested yet. All other arches are not buildable for now.
> > > > > > > > 
> > > > > > > > Patch completely eliminates sysctl syscalls from the rtld and 
> > > > > > > > libc
> > > > > > > > startup. Without the patch, a single run of /bin/ls did 6 
> > > > > > > > sysctls,
> > > > > > > > with the patch, no sysctls is queried at all.
> > > > > > > > 
> > > > > > Comparing with the originally posted patch, I added support for all
> > > > > > architectures, tested amd64 and ia32 on amd64, and converted 
> > > > > > getpagesizes(3)
> > > > > > that added two more startup sysctls.
> > > > > > 
> > > > > > Would be nice to get a testing for at least some !x86 architectures
> > > > > > before the commit, I added some people who helped me in past, to 
> > > > > > the Cc:.
> > > > > > 
> > > > > 
> > > > > Doesn't look good on sparc64:
> > > > > <...>
> > > > > NFS ROOT: 192.168.1.40:/usr/data/nfsroot/sparc64
> > > > > dc1: link state changed to UP
> > > > > pid 24 (ifconfig), uid 0: exited on signal 11
> > > > > Segmentation fault
> > > > > Interface  IP-Address  Broadcast
> > > > > pid 29 (rcorder), uid 0: exited on signal 11
> > > > > Segmentation fault
> > > > > pid 30 (grep), uid 0: exited on signal 11
> > > > > Segmentation fault
> > > > > pid 31 (rcorder), uid 0: exited on signal 11
> > > > > Segmentation fault
> > > > >  
> > > > > pid 32 (date), uid 0: exited on signal 11
> > > > > Segmentation fault
> > > > > Jun 29 12:20:50 getty[36]: open /dev/ttyv3: No such file or directory
> > > > > <...>
> > > > > 
> > > > > Unfortunately, I currently lack the time to debug this.
> > > > 
> > > > Thank you.
> > > 
> > > Did yu have time to look at this problem?  It would be nice to have this
> > > in the tree.
> > 
> > I cannot move forward without the help from somebody having access to
> > sparc64 system where the problem is reproducable.
> 
> Do you have a debug version of the patch which outputs the necessary
> information?

I would suggest to build rtld and libc with debugging symbols and
get full backtrace from the faults.


pgpsfXFxdD7Ze.pgp
Description: PGP signature


Re: Avoiding sysctl at program startup using ELF aux vector (was: concurrent sysctl implementation)

2010-08-06 Thread Kostik Belousov
On Fri, Aug 06, 2010 at 07:06:33AM +0200, Jeremie Le Hen wrote:
> Hi Kib,
> 
> In-Reply-To: <20100629083901.gg13...@deviant.kiev.zoral.com.ua>
> On Tue, Jun 29, 2010 at 11:39:01AM +0300, Kostik Belousov wrote:
> > On Tue, Jun 29, 2010 at 10:26:39AM +0200, Marius Strobl wrote:
> > > On Mon, Jun 28, 2010 at 05:48:59PM +0300, Kostik Belousov wrote:
> > > > On Wed, Jun 23, 2010 at 11:09:59PM +0200, Jeremie Le Hen wrote:
> > > > > Hi Kostik,
> > > > > 
> > > > > This patch seems to have faded out from memory.  Is it possible to go
> > > > > forward and commit it?
> > > > I refreshed the patch. Hopefully, nobody will object, and I commit it
> > > > shortly.
> > > > 
> > > > > 
> > > > > Thanks,
> > > > > Regards.
> > > > > 
> > > > > On Sat, Jul 25, 2009 at 12:29:16AM +0300, Kostik Belousov wrote:
> > > > > > Below is the prototype that seems to work for me both with patched 
> > > > > > and
> > > > > > old rtld on i386. Patch also contains bits for amd64 that I did not
> > > > > > tested yet. All other arches are not buildable for now.
> > > > > > 
> > > > > > Patch completely eliminates sysctl syscalls from the rtld and libc
> > > > > > startup. Without the patch, a single run of /bin/ls did 6 sysctls,
> > > > > > with the patch, no sysctls is queried at all.
> > > > > > 
> > > > Comparing with the originally posted patch, I added support for all
> > > > architectures, tested amd64 and ia32 on amd64, and converted 
> > > > getpagesizes(3)
> > > > that added two more startup sysctls.
> > > > 
> > > > Would be nice to get a testing for at least some !x86 architectures
> > > > before the commit, I added some people who helped me in past, to the 
> > > > Cc:.
> > > > 
> > > 
> > > Doesn't look good on sparc64:
> > > <...>
> > > NFS ROOT: 192.168.1.40:/usr/data/nfsroot/sparc64
> > > dc1: link state changed to UP
> > > pid 24 (ifconfig), uid 0: exited on signal 11
> > > Segmentation fault
> > > Interface  IP-Address  Broadcast
> > > pid 29 (rcorder), uid 0: exited on signal 11
> > > Segmentation fault
> > > pid 30 (grep), uid 0: exited on signal 11
> > > Segmentation fault
> > > pid 31 (rcorder), uid 0: exited on signal 11
> > > Segmentation fault
> > >  
> > > pid 32 (date), uid 0: exited on signal 11
> > > Segmentation fault
> > > Jun 29 12:20:50 getty[36]: open /dev/ttyv3: No such file or directory
> > > <...>
> > > 
> > > Unfortunately, I currently lack the time to debug this.
> > 
> > Thank you.
> 
> Did yu have time to look at this problem?  It would be nice to have this
> in the tree.

I cannot move forward without the help from somebody having access to
sparc64 system where the problem is reproducable.


pgpF0UGP8pgpd.pgp
Description: PGP signature