Low nfs write throughput

2011-11-17 Thread Daryl Sayers

Can anyone suggest why I am getting poor write performance from my nfs setup?
I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
4GB of memory and a dual-core 3GHz processor, using 147GB 15k Seagate SAS drives
with onboard Gb network cards connected to an idle network. The results below
show that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
improves if I use async, but an smbfs mount still beats it. I am using the same
file, source and destinations for all tests. I have tried alternate network
cards with no resulting benefit.

oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
1950511+1 records in
1950511+1 records out
998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
1950477+74 records in
1950511+1 records out
998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec) (98MB/s)


oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
7619+1 records in
7619+1 records out
998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec) (15MB/s)


oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs /mnt
oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
7619+1 records in
7619+1 records out
998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec) (19MB/s)


oguido# mount -t smbfs //gemini/ufs /mnt
oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
7619+1 records in
7619+1 records out
998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec) (33MB/s)

Looking at systat -v on the destination, I see that the nfs test does not
exceed 16KB/t at 100% busy, whereas the other tests reach up to 128KB/t.
For the record, I get reads of 22MB/s without and 77MB/s with async turned on
for the nfs mount.
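
A note on units: dd reports bytes per second, so the rates above are megabytes per second rather than megabits. A minimal Python sketch of the conversion, using the byte count and elapsed times from the dd runs above (the function name `rates` and the decimal definition of a megabyte are assumptions made for illustration):

```python
def rates(total_bytes, seconds):
    """Return (megabytes per second, megabits per second) in decimal units."""
    bytes_per_sec = total_bytes / seconds
    return bytes_per_sec / 1e6, bytes_per_sec * 8 / 1e6


# Elapsed times copied from the dd output in this message.
runs = {
    "dd over rsh": 10.115458,
    "nfs": 62.570260,
    "nfs async": 50.697024,
    "smbfs": 29.787616,
}

for name, secs in runs.items():
    mb, mbit = rates(998661755, secs)
    print(f"{name:12s} {mb:6.1f} MB/s {mbit:7.1f} Mbit/s")
```

Seen this way, the rsh run is already close to what a gigabit link can carry, while the nfs runs use only a fraction of it.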


A copy of dmesg:


Copyright (c) 1992-2011 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.2-STABLE #0: Tue Jul 26 02:49:49 UTC 2011
root@fm32-8-1106:/usr/obj/usr/src/sys/LOCAL i386
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Duo CPU E6850  @ 3.00GHz (2995.21-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
  Features=0xbfebfbff
  Features2=0xe3fd
  AMD Features=0x2010
  AMD Features2=0x1
  TSC: P-state invariant
real memory  = 4294967296 (4096 MB)
avail memory = 3141234688 (2995 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0  irqs 0-23 on motherboard
kbd1 at kbdmux0
cryptosoft0:  on motherboard
acpi0:  on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a (3) failed
acpi0: reservation of 10, bff0 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0:  on acpi0
ACPI Warning: Incorrect checksum in table [OEMB] - 0xBE, should be 0xB1 (20101013/tbutils-354)
cpu1:  on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  irq 16 at device 1.0 on pci0
pci1:  on pcib1
mpt0:  port 0x7800-0x78ff mem 0xfd4fc000-0xfd4f,0xfd4e-0xfd4e irq 16 at device 0.0 on pci1
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.18.0
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
mpt0: 0 Active Volumes (2 Max)
mpt0: 0 Hidden Drive Members (14 Max)
uhci0:  port 0xdc00-0xdc1f irq 16 at device 26.0 on pci0
uhci0: [ITHREAD]
uhci0: LegSup = 0x2f00
usbus0:  on uhci0
uhci1:  port 0xe000-0xe01f irq 17 at device 26.1 on pci0
uhci1: [ITHREAD]
uhci1: LegSup = 0x2f00
usbus1:  on uhci1
ehci0:  mem 0xfebffc00-0xfebf irq 18 at device 26.7 on pci0
ehci0: [ITHREAD]
usbus2: EHCI version 1.0
usbus2:  on ehci0
pci0:  at device 27.0 (no driver attached)
pcib2:  irq 16 at device 28.0 on pci0
pci5:  on pcib2
atapci0:  port 0xac00-0xac7f mem 0xfd9ffc00-0xfd9ffc7f,0xfd9f8000-0xfd9fbfff irq 16 at device 0.0 on pci5
atapci0: [ITHREAD]
ata2:  on atapci0
ata2: [ITHREAD]
ata3:  on atapci0
ata3: [ITHREAD]
pcib3:  irq 17 at device 28.1 on pci0
pci4:  on pcib3
em0:  port 0x9c00-0x9c1f mem 0xfd7e-0xfd7f,0xfd7c-0xfd7d irq 17 at device 0.0 on pci4
em0: Using an MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:1b:21:04:ac:11
pcib4:  irq 19 at device 28.3 on pci0
pci3:  on pcib4
age0:  mem 0xfd6c-0xfd6f irq 19 at device 0.0 on pci3
age0: 1280 Tx FIFO, 2364 Rx FIFO
age0: Using 1 MSI messages.
miibus0:  on age0
atphy0:  PHY 0 on miibus0
atphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 1000baseT-FDX-master, auto
age0: Ethernet address: 00:1a:92:d2:de:cc
age0: [FILTER]
pcib5:  irq 16 at device 28.4 on pci0
pci2:  on pcib5
atapci1:  port 
0x8c00-0x8c07,0x8880-0x8883,0x8800-0x8807,0x8480-0x8483,0x84

memory leaks (and some other warnings like division by zero) auto reports for FreeBSD source code

2011-11-17 Thread Slono Slono
Hi

This information may be of interest: in most cases a free() call is simply
missing, and someone with a commit bit is needed to correct it. Reported
by cppcheck (http://cppcheck.sourceforge.net/):

This report is current for FreeBSD 9.0-PRERELEASE

Scan for /usr/src/libexec/:

[rtld-elf/rtld.c:1660]: (error) Resource leak: fd
[ftpd/ftpd.c:610]: (error) Mismatching allocation and deallocation: fd

Scan for /usr/src/lib/:

[libarchive/archive_entry_link_resolver.c:240]: (error) Memory leak: le
[libarchive/archive_read_open_filename.c:115]: (error) Resource leak: fd
[libarchive/archive_read_support_format_tar.c:638]: (error) Buffer access 
out-of-bounds: header.magic
[libarchive/archive_write_disk.c:1767]: (error) Memory leak: le
[libc/db/test/btree.tests/main.c:601]: (error) Resource leak: fp
[libc/gen/_pthread_stubs.c:218]: (error) Analysis failed. If the code is valid 
then please report this failure.
[libc/mips/gen/makecontext.c:107]: (error) Uninitialized variable: i
[libc/net/getifaddrs.c:250]: (error) Invalid deallocation
[libc/net/getifaddrs.c:255]: (error) Invalid deallocation
[libc/net/nscache.c:118]: (error) Common realloc mistake: 'buffer' nulled but 
not freed upon failure
[libc/net/nscache.c:204]: (error) Common realloc mistake: 'buffer' nulled but 
not freed upon failure
[libc/net/nscache.c:299]: (error) Common realloc mistake: 'buffer' nulled but 
not freed upon failure
[libc/net/nscache.c:375]: (error) Common realloc mistake: 'buffer' nulled but 
not freed upon failure
[libc/quad/qdivrem.c:100]: (error) Division by zero
[libc/rpc/clnt_perror.c:301]: (error) Allocation with clnt_spcreateerror, 
fprintf doesn't release it.
[libc/rpc/netnamer.c:331]: (error) Resource leak: fd
[libdisk/open_disk.c:89]: (error) Memory leak: d
[libdwarf/dwarf_attr.c:49]: (error) Possible null pointer dereference: at - 
otherwise it is redundant to check if at is null at line 54
[libdwarf/dwarf_init.c:505]: (error) Memory leak: cu
[libedit/readline.c:1009]: (error) Possible null pointer dereference: arr - 
otherwise it is redundant to check if arr is null at line 1006
[libedit/readline.c:1693]: (error) Possible null pointer dereference: pwd - 
otherwise it is redundant to check if pwd is null at line 1695
[libedit/readline.c:1934]: (error) Memory leak: wbuf
[libfetch/file.c:148]: (error) Resource leak: dir
[libgssapi/gss_accept_sec_context.c:217]: (error) Possible null pointer 
dereference: mc - otherwise it is redundant to check if mc is null at line 219
[libgssapi/gss_accept_sec_context.c:220]: (error) Memory leak: ctx
[libgssapi/gss_inquire_cred_by_mech.c:68]: (error) Possible null pointer 
dereference: mcp - otherwise it is redundant to check if mcp is null at line 70
[libgssapi/gss_verify_mic.c:42]: (error) Possible null pointer dereference: ctx 
- otherwise it is redundant to check if ctx is null at line 46
[libgssapi/gss_wrap_size_limit.c:43]: (error) Possible null pointer 
dereference: ctx - otherwise it is redundant to check if ctx is null at line 46
[libjail/jail_getid.c:103]: (error) Uninitialized variable: namebuf
[libmd/mdXhl.c:63]: (error) Resource leak: f
[libncp/ncpl_conn.c:495]: (error) Resource leak: d
[libncp/ncpl_file.c:89]: (error) Resource leak: d
[libprocstat/libprocstat.c:723]: (error) Memory leak: path
[librt/timer.c:106]: (error) Memory leak: timer
[libstand/bzipfs.c:194]: (error) Resource leak: rawfd
[libstand/qdivrem.c:99]: (error) Division by zero

Scan for /usr/src/bin/:

[ps/print.c:427]: (error) Memory leak: buf
[ps/print.c:457]: (error) Memory leak: buf
[sh/jobs.c:825]: (error) Allocation with open, if doesn't release it.
[sh/mknodes.c:269]: (error) Resource leak: hfile
[sh/mknodes.c:269]: (error) Resource leak: patfile
[pax/cache.c:333]: (error) Possible null pointer dereference: ptr - otherwise 
it is redundant to check if ptr is null at line 345
[pax/cache.c:397]: (error) Possible null pointer dereference: ptr - otherwise 
it is redundant to check if ptr is null at line 408

Scan for /usr/src/cddl/:

[contrib/opensolaris/cmd/dtrace/test/cmd/baddof/baddof.c:202]: (error) 
Deallocating a deallocated pointer: fd
[contrib/opensolaris/cmd/lockstat/sym.c:150]: (error) Memory leak: name
[contrib/opensolaris/cmd/sgs/tools/common/sgsmsg.c:311]: (error) Memory leak: 
buffer
[contrib/opensolaris/cmd/sgs/tools/common/sgsmsg.c:503]: (error) Memory leak: 
buf
[contrib/opensolaris/cmd/sgs/tools/common/sgsmsg.c:950]: (error) Common realloc 
mistake: 'token_buffer' nulled but not freed upon failure
[contrib/opensolaris/cmd/zpool/zpool_main.c:4622]: (error) Resource leak: fd
[contrib/opensolaris/lib/libdtrace/common/dt_aggregate.c:568]: (error) Memory 
leak: percpu
[contrib/opensolaris/lib/libdtrace/common/dt_cc.c:2117]: (error) Resource leak: 
dirp
[contrib/opensolaris/lib/libdtrace/common/dt_link.c:1735]: (error) Resource 
leak: fd
[contrib/opensolaris/lib/libdtrace/common/dt_strtab.c:257]: (error) Memory 
leak: hp
[contrib/opensolaris/lib/libzfs/common/libzfs_import.c:1006]: (error) Dan
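
To see which kinds of findings dominate a report like this one, the lines can be tallied by category. A rough Python sketch, assuming each finding sits on one line in the `[path:line]: (severity) message` shape seen above (wrapped lines would have to be rejoined first); the names `FINDING` and `tally` are made up for this example and are not part of cppcheck:

```python
import re
from collections import Counter

# One finding per line: [path:line]: (severity) message
FINDING = re.compile(
    r"^\[(?P<path>[^\]:]+):(?P<line>\d+)\]: \((?P<severity>\w+)\) (?P<message>.+)$"
)


def tally(report_lines):
    """Count findings by the part of the message before the first colon."""
    counts = Counter()
    for line in report_lines:
        match = FINDING.match(line.strip())
        if match is None:
            continue  # headings, blank lines, wrapped continuations
        category = match.group("message").split(":", 1)[0].strip()
        counts[category] += 1
    return counts


# A few lines copied from the report above.
sample = [
    "Scan for /usr/src/libexec/:",
    "[rtld-elf/rtld.c:1660]: (error) Resource leak: fd",
    "[libdisk/open_disk.c:89]: (error) Memory leak: d",
    "[ps/print.c:427]: (error) Memory leak: buf",
    "[libc/quad/qdivrem.c:100]: (error) Division by zero",
]

for category, count in tally(sample).most_common():
    print(f"{count:3d}  {category}")
```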

FreeBSD 9.0-RC2 Available...

2011-11-17 Thread Ken Smith

The second of the Release Candidate builds for the 9.0-RELEASE release
cycle is now available.  Since this is the first release of a brand
new branch I cross-post the announcements on both -current and -stable.
But just so you know most of the developers active in head and stable/9
pay more attention to the -current mailing list.  If you notice problems
you can report them through the normal Gnats PR system or on the
-current mailing list.

At present, plans are for one more RC build, which will be followed
by the release.  The 9.0-RELEASE cycle will be tracked here:

http://wiki.freebsd.org/Releng/9.0TODO

NOTE: The location of the FTP install tree and ISOs is the same as it
had been for BETA2/BETA3/RC1, though we are still deciding if this will
be the layout we switch to for the release.

ISO images for the following architectures are available, with pathnames
given relative to the top-level of the FTP site:

  amd64: .../releases/amd64/amd64/ISO-IMAGES/9.0/
  i386: .../releases/i386/i386/ISO-IMAGES/9.0/
  ia64: .../releases/ia64/ia64/ISO-IMAGES/9.0/
  powerpc: .../releases/powerpc/powerpc/ISO-IMAGES/9.0/
  powerpc64: .../releases/powerpc/powerpc64/ISO-IMAGES/9.0/
  sparc64: .../releases/sparc64/sparc64/ISO-IMAGES/9.0/

MD5/SHA256 checksums are tacked on below.

If you would like to use csup/cvsup mechanisms to access the source
tree the branch tag to use is now "RELENG_9_0", if you use "." (head)
you will get 10-CURRENT.  If you would like to access the source tree
via SVN it is "svn://svn.freebsd.org/base/releng/9.0/".  We still have
the nit that the creation of a new SVN branch winds up causing what
looks like a check-in of the entire tree in CVS (a side-effect of the
svn2cvs exporter) so "mergemaster -F" is your friend if you are using
csup/cvsup.

FreeBSD Update
--------------

The freebsd-update(8) utility supports binary upgrades of i386 and amd64 systems
running earlier FreeBSD releases. Systems running 7.[34]-RELEASE,
8.[12]-RELEASE, 9.0-BETA[123], or 9.0-RC1 can upgrade as follows:

First, a minor change must be made to the freebsd-update code in order
for it to accept file names appearing in FreeBSD 9.0 which contain the '%'
and '@' characters; without this change, freebsd-update will error out
with the message "The update metadata is correctly signed, but failed an
integrity check".

# sed -i '' -e 's/=_/=%@_/' /usr/sbin/freebsd-update

Now freebsd-update can fetch bits belonging to 9.0-RC2.  During this process
freebsd-update will ask for help in merging configuration files.

# freebsd-update upgrade -r 9.0-RC2

Due to changes in the way that FreeBSD is packaged on the release media, two
complications may arise in this process if upgrading from FreeBSD 7.x or 8.x:
1. The FreeBSD kernel, which previously could appear in either /boot/kernel
or /boot/GENERIC, now only appears as /boot/kernel.  As a result, any kernel
appearing in /boot/GENERIC will be deleted.  Please carefully read the output
printed by freebsd-update and confirm that an updated kernel will be placed
into /boot/kernel before proceeding beyond this point.
2. The FreeBSD source tree in /usr/src (if present) will be deleted.  (Normally
freebsd-update will update a source tree, but in this case the changes in
release packaging result in freebsd-update not recognizing that the source tree
from the old release and the source tree from the new release correspond to the
same part of FreeBSD.)

# freebsd-update install

The system must now be rebooted with the newly installed kernel before the
non-kernel components are updated.

# shutdown -r now

After rebooting, freebsd-update needs to be run again to install the new
userland components:

# freebsd-update install

At this point, users of systems being upgraded from FreeBSD 8.2-RELEASE or
earlier will be prompted by freebsd-update to rebuild all third-party
applications (e.g., ports installed from the ports tree) due to updates in
system libraries.

After updating installed third-party applications (and again, only if
freebsd-update printed a message indicating that this was necessary), run
freebsd-update again so that it can delete the old (no longer used) system
libraries:

# freebsd-update install

Finally, reboot into 9.0-RC2:

# shutdown -r now

Checksums:

MD5 (FreeBSD-9.0-RC2-amd64-bootonly.iso) = 0165f0a2a1141a4c69413ec0c0b7d754
MD5 (FreeBSD-9.0-RC2-amd64-memstick.img) = 84713f2f556cdd58aa18e36093525e6c
MD5 (FreeBSD-9.0-RC2-amd64-dvd1.iso) = 59792b2012e6feff6981d3cf58c0b901

MD5 (FreeBSD-9.0-RC2-i386-bootonly.iso) = ed3e7b8ac2fdadd2c41c0d5c8b26943c
MD5 (FreeBSD-9.0-RC2-i386-memstick.img) = f396728fbd72c61078a7f9511b0c71ff
MD5 (FreeBSD-9.0-RC2-i386-dvd1.iso) = cacc9962fa80a6b9a5067c907f127e8b

MD5 (FreeBSD-9.0-RC2-ia64-bootonly.iso) = faaf6f0c529b8ec59b9d4252ae666dc7
MD5 (FreeBSD-9.0-RC2-ia64-memstick) = b937883e7634334bef1ddf3eb1e06ffb
MD5 (FreeBSD-9.0-RC2-ia64-release.iso) = c1f5623734132ea80a9fa2298262884c

MD5 (FreeBSD-9.0-RC2-powerpc-bootonly.iso) = 35e667deaa72
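
A downloaded image can be checked against the values above before use. A small Python sketch using the standard hashlib module; the helper names `file_digest` and `matches` are invented for this example, and the file is read in chunks so that a DVD-sized image does not have to fit in memory:

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, read chunk by chunk."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex, algorithm="md5"):
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `matches("FreeBSD-9.0-RC2-amd64-bootonly.iso", "0165f0a2a1141a4c69413ec0c0b7d754")` should return True for an intact download; passing `algorithm="sha256"` does the same for the SHA256 values.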

Re: ld: kernel.debug: Not enough room for program headers (allocated 5, need 6)

2011-11-17 Thread Artem Belevich
On Thu, Nov 17, 2011 at 6:41 AM, David Wolfskill  wrote:
> MAKE=/usr/obj/usr/src/make.i386/make sh /usr/src/sys/conf/newvers.sh GENERIC
> cc -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
> -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline 
> -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I. 
> -I/usr/src/sys -I/usr/src/sys/contrib/altq -D_KERNEL 
> -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common 
> -finline-limit=8000 --param inline-unit-growth=100 --param 
> large-function-growth=1000  -mno-align-long-strings 
> -mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow -mno-sse -mno-sse2 
> -mno-sse3 -ffreestanding -fstack-protector -Werror  vers.c
> linking kernel.debug
> ld: kernel.debug: Not enough room for program headers (allocated 5, need 6)
> ld: final link failed: Bad value
> *** Error code 1

> I'm rather left wondering "room" where, precisely?

Room for the program headers at the beginning of the ELF file. Look at
sys/conf/ldscript.* and search for SIZEOF_HEADERS.
One way to work around the issue is to replace SIZEOF_HEADERS with a
fixed value. Try 0x1000.

--Artem
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


ld: kernel.debug: Not enough room for program headers (allocated 5, need 6)

2011-11-17 Thread David Wolfskill
Color me perplexed.

3 of the 4 kernels I built were fine; the 4th one ... ugh.

I'm tracking stable/8 daily & rebuild as often as that (less if there
are no changes).  I do this on 2 machines: my laptop (which only builds
for itself) and a "build machine" (named "freebeast"), which builds
GENERIC for itself, as well as kernels ALBERT & JANUS for a couple of
other machines.

The laptop was fine (for stable/8); it was running:

FreeBSD g1-227.catwhisker.org 8.2-STABLE FreeBSD 8.2-STABLE #272 r227447M: Fri 
Nov 11 04:07:05 PST 2011 
r...@g1-227.catwhisker.org:/common/S1/obj/usr/src/sys/CANARY  i386

and is now running:

FreeBSD g1-227.catwhisker.org 8.2-STABLE FreeBSD 8.2-STABLE #273 r227611M: Thu 
Nov 17 04:16:50 PST 2011 
r...@g1-227.catwhisker.org:/common/S1/obj/usr/src/sys/CANARY  i386

(The "M" suffix on the GRN is for a patch to sys/conf/newvers.sh
so it will recognize my working copy as SVN even though there is
no sys/.svn directory, since I'm using subversion-1.7.1.)

The build machine is running:

FreeBSD freebeast.catwhisker.org 8.2-STABLE FreeBSD 8.2-STABLE #400 r227447M: 
Fri Nov 11 04:11:35 PST 2011 
r...@freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/GENERIC  i386

it rebuilt GENERIC & ALBERT OK, then on JANUS, the "make buildkernel"
terminated with:

...
cc -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
-Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  
-Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I. -I/usr/src/sys 
-I/usr/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include 
opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 
--param large-function-growth=1000  -mno-align-long-strings 
-mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 
-ffreestanding -fstack-protector -Werror  hints.c
cc -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
-Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  
-Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I. -I/usr/src/sys 
-I/usr/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include 
opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 
--param large-function-growth=1000  -mno-align-long-strings 
-mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 
-ffreestanding -fstack-protector -Werror  vnode_if.c
:> hack.c
cc -shared -nostdlib hack.c -o hack.So
rm -f hack.c
MAKE=/usr/obj/usr/src/make.i386/make sh /usr/src/sys/conf/newvers.sh GENERIC
cc -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
-Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  
-Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I. -I/usr/src/sys 
-I/usr/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include 
opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 
--param large-function-growth=1000  -mno-align-long-strings 
-mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 
-ffreestanding -fstack-protector -Werror  vers.c
linking kernel.debug
ld: kernel.debug: Not enough room for program headers (allocated 5, need 6)
ld: final link failed: Bad value
*** Error code 1

Stop in /common/S1/obj/usr/src/sys/JANUS.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
freebeast(8.2-S)[3] 


I'm rather left wondering "room" where, precisely?


The other perplexing thing is that JANUS is actually a subset of
my laptop's kernel config, and it's been getting built routinely;
as of its most recent update, it is running:

FreeBSD janus.catwhisker.org 8.2-STABLE FreeBSD 8.2-STABLE #398 r227447M: Fri 
Nov 11 04:15:30 PST 2011 
r...@freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/JANUS  i386

And I've not changed the JANUS config since date: 2010/04/18 13:04:27;
author: david;  state: Exp;

Here's a copy:

#
# JANUS -- kernel configuration file for FreeBSD/i386 as a packet filter
#

include GENERIC

# firewall support, for access limiting

options IPFIREWALL
# options   IPFIREWALL_DEFAULT_TO_ACCEPT
options IPFIREWALL_VERBOSE  #enable logging to syslogd(8)
options IPFIREWALL_VERBOSE_LIMIT=0  #do not limit verbosity

# dummynet for bandwidth limiting (requires IPFIREWALL)

options DUMMYNET

# divert sockets for natd

options IPDIVERT

[End of JANUS config]


As noted, since my laptop is also exposed to networks I don't
control, its kernel is also built with the above options (as well
as quite a few more, for support of user-interface, vs. headless
server, operation).  And it built & runs fine.

Clues?

Thanks!

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.



Re: 8.2 + apache == a LOT of sigprocmask

2011-11-17 Thread Daniil Cherednik

On 17.11.2011 14:18, Jeremy Chadwick wrote:

On Thu, Nov 17, 2011 at 10:12:10AM +0200, Kostik Belousov wrote:

On Wed, Nov 16, 2011 at 11:59:06PM -0800, Doug Barton wrote:

On 11/16/2011 23:49, Kostik Belousov wrote:

On Wed, Nov 16, 2011 at 10:46:27PM -0800, Doug Barton wrote:

On 11/15/2011 02:09, Jeremy Chadwick wrote:

On Tue, Nov 15, 2011 at 11:07:45AM +0200, Kostik Belousov wrote:

On Mon, Nov 14, 2011 at 12:51:35PM -0800, Doug Barton wrote:

On 11/14/2011 12:31, Doug Barton wrote:

Trying to track down a load problem we're seeing on 8.2-RELEASE-p4 i386
in a busy web hosting environment I came across the following post:

http://lists.freebsd.org/pipermail/freebsd-questions/2011-October/234520.html

That basically describes what we're seeing as well, including the
"doesn't happen on Linux" part.

Does anyone have any ideas about this?

With incredibly similar stuff running on 7.x we didn't see this problem,
so it seems to be something new in 8.

Just took a closer look at our ktrace, and actually our pattern is
slightly different than the one in that post. In ours the second option
is null, but the third is set:

74195 httpd 0.17 RET   sigprocmask 0
74195 httpd 0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
74195 httpd 0.09 RET   sigprocmask 0
74195 httpd 0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
74195 httpd 0.09 RET   sigprocmask 0
74195 httpd 0.12 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)

But repeated hundreds of times in a row.
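
To put a number on "hundreds of times", the trace can be tallied per syscall. A rough Python sketch, assuming whitespace-separated kdump-style lines with a CALL field followed by `name(args)`, as in the excerpt above; the function name `count_calls` is hypothetical and the sample timestamps are copied as they appear here:

```python
from collections import Counter


def count_calls(kdump_lines):
    """Count CALL records per syscall name in kdump-style text lines."""
    counts = Counter()
    for line in kdump_lines:
        fields = line.split()
        if "CALL" not in fields:
            continue  # RET and other record types are ignored
        position = fields.index("CALL") + 1
        if position >= len(fields):
            continue  # malformed line with nothing after CALL
        name = fields[position].split("(", 1)[0]
        counts[name] += 1
    return counts


sample = [
    "74195 httpd 0.17 RET   sigprocmask 0",
    "74195 httpd 0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)",
    "74195 httpd 0.09 RET   sigprocmask 0",
    "74195 httpd 0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)",
]

for name, count in count_calls(sample).most_common():
    print(f"{count:6d}  {name}")
```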

The calls cannot come from rtld, they are generated by some setjmp()
invocation. If signal-safety is not needed, sigsetjmp() should be used
instead.

A quick grep of the apache httpd source shows a single setjmp() in their
copy of pcre. No idea whether it is safe to change setjmp() into sigsetjmp(?, 0).

I hate cross-posting, but: adding freebsd-apache@ to the list.  Some of
the Apache folks (not just port committers) may have some insight to
Kostik's findings.

Thanks to everyone for the responses. We tried Kostik's suggestion and
unfortunately it didn't reduce the number of sigprocmask() calls to a
statistically significant degree.

Does anyone have any other ideas on ways to debug this? We're sort of
running out of things to test. :-/

Given how important (and prevalent) the Apache + FreeBSD combination is,
I'm kind of disturbed that we're seeing this performance problem, and if
it's something in 8.x that's also in 9.x, it would be better to fix it
prior to 9.0-RELEASE.

Since my guess appeared to be not useful,

Well I wouldn't say that they weren't useful, we eliminated the obvious
candidate. So, "not good news" certainly, but not unhelpful. :)


the way forward is to identify
the location of the call(s) that cause the issue. I suggest compiling
at least apache itself, libc, rtld and libthr (if used) with debugging
information. Then, attach to the running apache worker with gdb and

Note this part.


set breakpoint on sigprocmask. Several backtraces from the hit breakpoint
should give enough data.

We tried that, and got this:

Loaded symbols for /libexec/ld-elf.so.1
0x28183a5d in accept () from /lib/libc.so.7
(gdb) b sigprocmask
Breakpoint 1 at 0x282d8f84
(gdb) c
Continuing.
no thread to satisfy query
0x28183a5d in accept () from /lib/libc.so.7
(gdb)

It seems your libc has no debugging information.
accept() is the pure syscall wrapper, it cannot call sigprocmask.
If gdb caught the PLT trampoline instead of real accept(),  we would
see the rtld frames. So install libc, libthr and rtld with debug.

Also, having debug symbols for apache itself can be useful.

I'd also like to point out that enabling debugging symbols in devel/apr1
will be greatly needed here, not just in www/apache*.

I'm wondering if maybe this is some sort of pthread "thing" going on.  A
quick grep -r sigmask of the Apache source turns up some pthread_* bits
pertaining to worker.

Is Apache built using WITH_THREADS?  What about devel/apr1?

I don't use worker MPM on any of our boxes, we actually use ITK MPM
solely because of the hosting nature of what we do.  I've actually never
seen worker MPM in use on any *IX machine I've been on or administrated,
only prefork.  The Apache documentation even mentions that "if you want
stability or compatibility, prefork is the choice", while "if you want
scalability, worker is a better choice"[1].  These sorts of quotes often
shock me given what year it is.  :-)

[1]: http://httpd.apache.org/docs/2.0/mpm.html

We use the ITK MPM too, but we have serious performance trouble on FreeBSD.
Also, I should say that we can't use keep-alive connections, so apache
creates a new child for each request.


--

Best regards,
Daniil Cherednik
.masterhost



Re: 8.2 + apache == a LOT of sigprocmask

2011-11-17 Thread Tom Evans
On Thu, Nov 17, 2011 at 10:18 AM, Jeremy Chadwick wrote:
> I don't use worker MPM on any of our boxes, we actually use ITK MPM
> solely because of the hosting nature of what we do.  I've actually never
> seen worker MPM in use on any *IX machine I've been on or administrated,
> only prefork.  The Apache documentation even mentions that "if you want
> stability or compatibility, prefork is the choice", while "if you want
> scalability, worker is a better choice"[1].  These sorts of quotes often
> shock me given what year it is.  :-)
>

I've used both worker and event MPMs in production on high volume
sites for > 4 years now, running on FreeBSD 7, with no problems. I
think you are cherry picking the quotes from httpd's 2.0
documentation, which is actually an old bit of software now - it has
just been voted EOL. The current stable (2.2) docs actually say:

"sites that need a great deal of scalability can choose to use a
threaded MPM like worker or event, while sites requiring stability or
compatibility with older software can use a prefork"

Event and worker have no issues unless you run non thread safe
modules, or modules which use libraries which are not thread safe, eg
PHP (more commonly, a PHP extension).

Cheers

Tom


Re: Trouble with SSD on SATA

2011-11-17 Thread Willem Jan Withagen

On 2011-11-17 12:20, Jeremy Chadwick wrote:

On Thu, Nov 17, 2011 at 12:03:26PM +0100, Willem Jan Withagen wrote:

On 2011-11-16 18:22, Peter Maloney wrote:

Willem,

I can only guess, but...

Is AHCI enabled in the bios? If you are not using 'fake-raid' for any
disks, you should [depending on FreeBSD version, HBA, etc.] probably
enable AHCI. Some servers actually come with SATA set in IDE mode. And
if you are using zfs, the controller optimally should not be RAID at
all. And if you have AHCI enabled already, try disabling it (losing hot
swapping ability, and some performance).


AHCI is enabled; otherwise I cannot use the last set of SATA
connectors with this MB. Controller for these connectors is CH9.


There are two "kinds" of AHCI on FreeBSD -- and I'm speaking strictly
about the kernel bits, not AHCI the option ROM/BIOS option:

ataahci.ko -- this is "AHCI support using ata(4)"
ahci.ko    -- this is "AHCI support using CAM(4)"

You want the latter, and I can tell you're using the former (if at all).


I did 'man ahci', and followed it from there.
So now I'm running with ahci.


There would be no "ata6" if you were using ahci.ko; it would be called
something like ahcichX, indicating "AHCI channel X".  Furthermore,
because CAM(4) gets used, your disk device names change from adX to
adaX.  This is expected.  Using ataahci.ko results in the disks still
being named adX, because it uses ata(4).


ahci0:  port 
0x1c70-0x1c77,0x1c64-0x1c67,0x1c68-0x1c6f,0x1c60-0x1c63,0x1c00-0x1c1f 
mem 0xdf923000-0xdf9237ff irq 17 at device 31.2 on pci0

ahci0: [ITHREAD]
ahci0: AHCI v1.20 with 6 3Gbps ports, Port Multiplier supported
ahcich0:  at channel 0 on ahci0
ahcich0: [ITHREAD]
ahcich1:  at channel 1 on ahci0
ahcich1: [ITHREAD]
ahcich2:  at channel 2 on ahci0
ahcich2: [ITHREAD]
ahcich3:  at channel 3 on ahci0
ahcich3: [ITHREAD]
ahcich4:  at channel 4 on ahci0
ahcich4: [ITHREAD]
ahcich5:  at channel 5 on ahci0
ahcich5: [ITHREAD]


Hope this helps shed some light on the confusion.  Generally speaking
you want to be using ahci.ko, mav@ and many others have spent a lot of
time working on that and getting it to play nice with CAM -- it's
beautiful, and hot-swapping works perfectly on all the Intel ICHxx
systems I've tried it on (ICH7R, ICH9R).


For the time being I can only concur with you. Scrubbing is much
faster than with the ata driver.


--WjW





Re: Trouble with SSD on SATA

2011-11-17 Thread Jeremy Chadwick
On Thu, Nov 17, 2011 at 12:03:26PM +0100, Willem Jan Withagen wrote:
> On 2011-11-16 18:22, Peter Maloney wrote:
> >Willem,
> >
> >I can only guess, but...
> >
> >Is AHCI enabled in the bios? If you are not using 'fake-raid' for any
> >disks, you should [depending on FreeBSD version, HBA, etc.] probably
> >enable AHCI. Some servers actually come with SATA set in IDE mode. And
> >if you are using zfs, the controller optimally should not be RAID at
> >all. And if you have AHCI enabled already, try disabling it (losing hot
> >swapping ability, and some performance).
> 
> AHCI is enabled; otherwise I cannot use the last set of SATA
> connectors with this MB. Controller for these connectors is CH9.

There are two "kinds" of AHCI on FreeBSD -- and I'm speaking strictly
about the kernel bits, not AHCI the option ROM/BIOS option:

ataahci.ko -- this is "AHCI support using ata(4)"
ahci.ko    -- this is "AHCI support using CAM(4)"

You want the latter, and I can tell you're using the former (if at all).

There would be no "ata6" if you were using ahci.ko; it would be called
something like ahcichX, indicating "AHCI channel X".  Furthermore,
because CAM(4) gets used, your disk device names change from adX to
adaX.  This is expected.  Using ataahci.ko results in the disks still
being named adX, because it uses ata(4).
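
As a quick check of which stack a disk is attached through, the naming rule above can be written down in a few lines. A Python sketch only: the function name `driver_stack` is made up, and the rule encodes nothing beyond the adX versus adaX convention described in this message, so other setups that also produce adaX names are not distinguished:

```python
import re


def driver_stack(device_name):
    """Guess the ATA stack from a disk device name, per the rule of thumb above."""
    if re.fullmatch(r"ada\d+", device_name):
        return "ahci.ko via CAM(4)"
    if re.fullmatch(r"ad\d+", device_name):
        return "ata(4), e.g. ataahci.ko"
    return "unknown"


for name in ("ad6", "ada0", "da0"):
    print(name, "->", driver_stack(name))
```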

Hope this helps shed some light on the confusion.  Generally speaking
you want to be using ahci.ko, mav@ and many others have spent a lot of
time working on that and getting it to play nice with CAM -- it's
beautiful, and hot-swapping works perfectly on all the Intel ICHxx
systems I've tried it on (ICH7R, ICH9R).

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



Re: Trouble with SSD on SATA

2011-11-17 Thread Willem Jan Withagen

On 2011-11-16 18:22, Peter Maloney wrote:

Willem,

I can only guess, but...

Is AHCI enabled in the bios? If you are not using 'fake-raid' for any
disks, you should [depending on FreeBSD version, HBA, etc.] probably
enable AHCI. Some servers actually come with SATA set in IDE mode. And
if you are using zfs, the controller optimally should not be RAID at
all. And if you have AHCI enabled already, try disabling it (losing hot
swapping ability, and some performance).


AHCI is enabled; otherwise I cannot use the last set of SATA connectors
with this MB. Controller for these connectors is ICH9.



What version of FreeBSD are you using? I had a terrible experience with
ZFS on FreeBSD 8.2 release, and 8.2-stable-April2011. I would recommend
upgrading to the latest 8-stable with cvsup.


I'm at most 1 month behind on STABLE, since I upgrade at about that
frequency... It might vary a little depending on the amount of rumbling on the list.



This thread seems related:
http://forums.freebsd.org/showthread.php?t=24189

The guy was using 8.2 release, and he downgraded to an old version of
the driver to fix, saying that a patch also existed in 8-stable that
fixes the problem.

Are you using an expander?


No SATA expanders...


What HBA / hard disk controller are you using?


A combination of ICH9 and an Areca in PCI-X; the disks are all exported as single disks.

Thanks for the suggestions

--WjW


On 16.11.2011 17:12, Willem Jan Withagen wrote:

Hi,

I'm getting these:

Nov 16 16:40:49 zfs kernel: ata6: port is not ready (timeout 15000ms)
tfd = 0080
Nov 16 16:40:49 zfs kernel: ata6: hardware reset timeout
Nov 16 16:41:50 zfs kernel: ata6: port is not ready (timeout 15000ms)
tfd = 0080
Nov 16 16:41:50 zfs kernel: ata6: hardware reset timeout

When inserting the tray with a SSD disk connected to that controller.

Which is probably due to a BIOS upgrade
At least it started after upgrading the BIOS. So I'm asking SuperMicro
for an older version.

When this happens, the system sometimes panics. I haven't written the
details down yet; it is somewhere in get_devices...

After the panic I really need to power down the machine, otherwise it
boots but stalls at finding any disks. It does not just find no disks,
it "freezes" at the point where it should report the found disks in the
BIOS boot.
So apparently the ata controllers are left in a very confused state.

Why is the controller found at boot, working as it should,
and why does it later just start generating these hardware resets?

--WjW


Re: Trouble with SSD on SATA

2011-11-17 Thread Willem Jan Withagen

On 2011-11-16 20:55, Alexander Motin wrote:

Hi.

On 16.11.2011 18:12, Willem Jan Withagen wrote:

I'm getting these:

Nov 16 16:40:49 zfs kernel: ata6: port is not ready (timeout 15000ms)
tfd = 0080
Nov 16 16:40:49 zfs kernel: ata6: hardware reset timeout
Nov 16 16:41:50 zfs kernel: ata6: port is not ready (timeout 15000ms)
tfd = 0080
Nov 16 16:41:50 zfs kernel: ata6: hardware reset timeout

When inserting the tray with a SSD disk connected to that controller.

Which is probably due to a BIOS upgrade
At least it started after upgrading the BIOS. So I'm asking SuperMicro
for an older version.

When this happens, the system sometimes panics. I haven't written the
details down yet; it is somewhere in get_devices...

After the panic I really need to power down the machine, otherwise it
boots but stalls at finding any disks. It does not just find no disks,
it "freezes" at the point where it should report the found disks in the
BIOS boot.
So apparently the ata controllers are left in a very confused state.

Why is the controller found at boot, working as it should,
and why does it later just start generating these hardware resets?


Looking at the messages, I would say that you are using an AHCI controller
with the old ata(4) driver. I would recommend that you try the new ahci(4)
driver. It has better hot-plug support and also supports NCQ and some other
features. Note that disks connected to it will be reported as adaX
instead of adX.


Hi Alexander,

Thanks for pointing that out.
I recompiled the kernel with ahci.

Using GPT for the most part took care of the fact that the
underlying device names changed.
The only "problem" was swap, which I renamed from ad{6,8} to ada{6,8}, but
ahci also renumbers. On swap, however, that is not much of a problem
during booting.


The root partition, however, is running off:
zfsboot                                         4.16G  62.3G      0      0      0      0
  mirror                                        4.16G  62.3G      0      0      0      0
    gptid/966bdc14-0b73-11df-a9ff-003048de97cd      -      -      0      0      0      0
    gptid/60be2c5d-4a83-11df-bf4f-003048de97cd      -      -      0      0      0      0


But they were not labeled as such in GPT, so that sort of makes sense.
I've seen a lot of discussion on how to try and fix this, but I
think that at the moment I will not bother.


Performance-wise I have the feeling that it is a lot better.
It was scrubbing a 6.5T filesystem and read io-ops were
around 100-200 with ata, but now they are more in the 600-900 range.


Let's see how we fare with this setting.

--WjW






Re: 8.2 + apache == a LOT of sigprocmask

2011-11-17 Thread Kostik Belousov
On Thu, Nov 17, 2011 at 01:26:49AM -0800, Doug Barton wrote:
> On 11/17/2011 00:12, Kostik Belousov wrote:
> > On Wed, Nov 16, 2011 at 11:59:06PM -0800, Doug Barton wrote:
> >> On 11/16/2011 23:49, Kostik Belousov wrote:
> >>> On Wed, Nov 16, 2011 at 10:46:27PM -0800, Doug Barton wrote:
>  On 11/15/2011 02:09, Jeremy Chadwick wrote:
> > On Tue, Nov 15, 2011 at 11:07:45AM +0200, Kostik Belousov wrote:
> >> On Mon, Nov 14, 2011 at 12:51:35PM -0800, Doug Barton wrote:
> >>> On 11/14/2011 12:31, Doug Barton wrote:
>  Trying to track down a load problem we're seeing on 8.2-RELEASE-p4 
>  i386
>  in a busy web hosting environment I came across the following post:
> 
>  http://lists.freebsd.org/pipermail/freebsd-questions/2011-October/234520.html
> 
>  That basically describes what we're seeing as well, including the
>  "doesn't happen on Linux" part.
> 
>  Does anyone have any ideas about this?
> 
>  With incredibly similar stuff running on 7.x we didn't see this 
>  problem,
>  so it seems to be something new in 8.
> >>>
> >>> Just took a closer look at our ktrace, and actually our pattern is
> >>> slightly different than the one in that post. In ours the second 
> >>> option
> >>> is null, but the third is set:
> >>>
> >>> 74195 httpd0.17 RET   sigprocmask 0
> >>> 74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> >>> 74195 httpd0.09 RET   sigprocmask 0
> >>> 74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> >>> 74195 httpd0.09 RET   sigprocmask 0
> >>> 74195 httpd0.12 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> >>>
> >>> But repeated hundreds of times in a row.
> >>
> >> The calls cannot come from rtld, they are generated by some setjmp()
> >> invocation. If signal-safety is not needed, sigsetjmp() should be used
> >> instead.
> >>
> >> Quick grep of the apache httpd source shows a single setjmp() in their
> >> copy of pcre. No idea is it to safe to change setjmp() into 
> >> sigsetjmp(?, 0).
> >
> > I hate cross-posting, but: adding freebsd-apache@ to the list.  Some of
> > the Apache folks (not just port committers) may have some insight to
> > Kostik's findings.
> 
>  Thanks to everyone for the responses. We tried Kostik's suggestion and
>  unfortunately it didn't reduce the number of sigprocmask() calls to a
>  statistically significant degree.
> 
>  Does anyone have any other ideas on ways to debug this? We're sort of
>  running out of things to test. :-/
> 
>  Given how important (and prevalent) the Apache + FreeBSD combination is,
>  I'm kind of disturbed that we're seeing this performance problem, and if
>  it's something in 8.x that's also in 9.x, it would be better to fix it
>  prior to 9.0-RELEASE.
> >>>
> >>> Since my guess appeared to be not useful,
> >>
> >> Well I wouldn't say that they weren't useful, we eliminated the obvious
> >> candidate. So, "not good news" certainly, but not unhelpful. :)
> >>
> >>> the way forward is to identify
> >>> the location of the call(s) that cause the issue. I suggest compiling
> >>> at least apache itself, libc, rtld and libthr (if used) with debugging
> >>> information. Then, attach to the running apache worker with the gdb and
> > Note this part.
> 
> Right, we attached to a worker, that's why it's in accept(). :)
> 
> > It seems your libc has no debugging information.
> > accept() is the pure syscall wrapper, it cannot call sigprocmask.
> > If gdb catched the PLT trampoline instead of real accept(),  we would
> > see the rtld frames. So install libc, libthr and rtld with debug.
> 
> It's not catching there though:
> 
> Reading symbols from /libexec/ld-elf.so.1...done.
> Loaded symbols for /libexec/ld-elf.so.1
> 0x28183b2d in accept () at accept.S:3
> 3 RSYSCALL(accept)
> (gdb) c
> Continuing.
> no thread to satisfy query
> 0x28183b2d in accept () at accept.S:3
> 3 RSYSCALL(accept)
> (gdb) info threads
> Cannot get thread info: invalid key
> (gdb)

Err, the other part of my message was that you should set a breakpoint
on sigprocmask. I want to see a backtrace from the breakpoint hit.
Several times.

The backtrace taken at attach time is of no use.




Re: 8.2 + apache == a LOT of sigprocmask

2011-11-17 Thread Lev Serebryakov
Hello, Kostik.
You wrote on 17 November 2011, 11:49:09:

> High-tech solution is to link with libunwind and add code into sigprocmask()
> to gather the stacks. But I expect that gdb attach is enough.
 The proper high-tech solution is to use DTrace. It is very good at such
 things.

-- 
// Black Lion AKA Lev Serebryakov 



Re: 8.2 + apache == a LOT of sigprocmask

2011-11-17 Thread Jeremy Chadwick
On Thu, Nov 17, 2011 at 10:12:10AM +0200, Kostik Belousov wrote:
> On Wed, Nov 16, 2011 at 11:59:06PM -0800, Doug Barton wrote:
> > On 11/16/2011 23:49, Kostik Belousov wrote:
> > > On Wed, Nov 16, 2011 at 10:46:27PM -0800, Doug Barton wrote:
> > >> On 11/15/2011 02:09, Jeremy Chadwick wrote:
> > >>> On Tue, Nov 15, 2011 at 11:07:45AM +0200, Kostik Belousov wrote:
> >  On Mon, Nov 14, 2011 at 12:51:35PM -0800, Doug Barton wrote:
> > > On 11/14/2011 12:31, Doug Barton wrote:
> > >> Trying to track down a load problem we're seeing on 8.2-RELEASE-p4 
> > >> i386
> > >> in a busy web hosting environment I came across the following post:
> > >>
> > >> http://lists.freebsd.org/pipermail/freebsd-questions/2011-October/234520.html
> > >>
> > >> That basically describes what we're seeing as well, including the
> > >> "doesn't happen on Linux" part.
> > >>
> > >> Does anyone have any ideas about this?
> > >>
> > >> With incredibly similar stuff running on 7.x we didn't see this 
> > >> problem,
> > >> so it seems to be something new in 8.
> > >
> > > Just took a closer look at our ktrace, and actually our pattern is
> > > slightly different than the one in that post. In ours the second 
> > > option
> > > is null, but the third is set:
> > >
> > > 74195 httpd0.17 RET   sigprocmask 0
> > > 74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> > > 74195 httpd0.09 RET   sigprocmask 0
> > > 74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> > > 74195 httpd0.09 RET   sigprocmask 0
> > > 74195 httpd0.12 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> > >
> > > But repeated hundreds of times in a row.
> > 
> >  The calls cannot come from rtld, they are generated by some setjmp()
> >  invocation. If signal-safety is not needed, sigsetjmp() should be used
> >  instead.
> > 
> >  Quick grep of the apache httpd source shows a single setjmp() in their
> >  copy of pcre. No idea is it to safe to change setjmp() into 
> >  sigsetjmp(?, 0).
> > >>>
> > >>> I hate cross-posting, but: adding freebsd-apache@ to the list.  Some of
> > >>> the Apache folks (not just port committers) may have some insight to
> > >>> Kostik's findings.
> > >>
> > >> Thanks to everyone for the responses. We tried Kostik's suggestion and
> > >> unfortunately it didn't reduce the number of sigprocmask() calls to a
> > >> statistically significant degree.
> > >>
> > >> Does anyone have any other ideas on ways to debug this? We're sort of
> > >> running out of things to test. :-/
> > >>
> > >> Given how important (and prevalent) the Apache + FreeBSD combination is,
> > >> I'm kind of disturbed that we're seeing this performance problem, and if
> > >> it's something in 8.x that's also in 9.x, it would be better to fix it
> > >> prior to 9.0-RELEASE.
> > > 
> > > Since my guess appeared to be not useful,
> > 
> > Well I wouldn't say that they weren't useful, we eliminated the obvious
> > candidate. So, "not good news" certainly, but not unhelpful. :)
> > 
> > > the way forward is to identify
> > > the location of the call(s) that cause the issue. I suggest compiling
> > > at least apache itself, libc, rtld and libthr (if used) with debugging
> > > information. Then, attach to the running apache worker with the gdb and
> Note this part.
> 
> > > set breakpoint on sigprocmask. Several backtraces from the hit breakpoint
> > > should give enough data.
> > 
> > We tried that, and got this:
> > 
> > Loaded symbols for /libexec/ld-elf.so.1
> > 0x28183a5d in accept () from /lib/libc.so.7
> > (gdb) b sigprocmask
> > Breakpoint 1 at 0x282d8f84
> > (gdb) c
> > Continuing.
> > no thread to satisfy query
> > 0x28183a5d in accept () from /lib/libc.so.7
> > (gdb)
> It seems your libc has no debugging information.
> accept() is the pure syscall wrapper, it cannot call sigprocmask.
> If gdb catched the PLT trampoline instead of real accept(),  we would
> see the rtld frames. So install libc, libthr and rtld with debug.
> 
> Also, having debug symbols for apache itself can be useful.

I'd also like to point out that enabling debugging symbols in devel/apr1
will be greatly needed here, not just in www/apache*.

I'm wondering if maybe this is some sort of pthread "thing" going on.  A
quick grep -r sigmask of the Apache source turns up some pthread_* bits
pertaining to worker.

Is Apache built using WITH_THREADS?  What about devel/apr1?

I don't use worker MPM on any of our boxes, we actually use ITK MPM
solely because of the hosting nature of what we do.  I've actually never
seen worker MPM in use on any *IX machine I've been on or administrated,
only prefork.  The Apache documentation even mentions that "if you want
stability or compatibility, prefork is the choice", while "if you want
scalability, worker is a better choice"[1].  These sorts of quotes often
shock me given

Re: 8.2 + apache == a LOT of sigprocmask

2011-11-17 Thread Doug Barton
On 11/17/2011 00:12, Kostik Belousov wrote:
> On Wed, Nov 16, 2011 at 11:59:06PM -0800, Doug Barton wrote:
>> On 11/16/2011 23:49, Kostik Belousov wrote:
>>> On Wed, Nov 16, 2011 at 10:46:27PM -0800, Doug Barton wrote:
 On 11/15/2011 02:09, Jeremy Chadwick wrote:
> On Tue, Nov 15, 2011 at 11:07:45AM +0200, Kostik Belousov wrote:
>> On Mon, Nov 14, 2011 at 12:51:35PM -0800, Doug Barton wrote:
>>> On 11/14/2011 12:31, Doug Barton wrote:
 Trying to track down a load problem we're seeing on 8.2-RELEASE-p4 i386
 in a busy web hosting environment I came across the following post:

 http://lists.freebsd.org/pipermail/freebsd-questions/2011-October/234520.html

 That basically describes what we're seeing as well, including the
 "doesn't happen on Linux" part.

 Does anyone have any ideas about this?

 With incredibly similar stuff running on 7.x we didn't see this 
 problem,
 so it seems to be something new in 8.
>>>
>>> Just took a closer look at our ktrace, and actually our pattern is
>>> slightly different than the one in that post. In ours the second option
>>> is null, but the third is set:
>>>
>>> 74195 httpd0.17 RET   sigprocmask 0
>>> 74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
>>> 74195 httpd0.09 RET   sigprocmask 0
>>> 74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
>>> 74195 httpd0.09 RET   sigprocmask 0
>>> 74195 httpd0.12 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
>>>
>>> But repeated hundreds of times in a row.
>>
>> The calls cannot come from rtld, they are generated by some setjmp()
>> invocation. If signal-safety is not needed, sigsetjmp() should be used
>> instead.
>>
>> Quick grep of the apache httpd source shows a single setjmp() in their
>> copy of pcre. No idea is it to safe to change setjmp() into sigsetjmp(?, 
>> 0).
>
> I hate cross-posting, but: adding freebsd-apache@ to the list.  Some of
> the Apache folks (not just port committers) may have some insight to
> Kostik's findings.

 Thanks to everyone for the responses. We tried Kostik's suggestion and
 unfortunately it didn't reduce the number of sigprocmask() calls to a
 statistically significant degree.

 Does anyone have any other ideas on ways to debug this? We're sort of
 running out of things to test. :-/

 Given how important (and prevalent) the Apache + FreeBSD combination is,
 I'm kind of disturbed that we're seeing this performance problem, and if
 it's something in 8.x that's also in 9.x, it would be better to fix it
 prior to 9.0-RELEASE.
>>>
>>> Since my guess appeared to be not useful,
>>
>> Well I wouldn't say that they weren't useful, we eliminated the obvious
>> candidate. So, "not good news" certainly, but not unhelpful. :)
>>
>>> the way forward is to identify
>>> the location of the call(s) that cause the issue. I suggest compiling
>>> at least apache itself, libc, rtld and libthr (if used) with debugging
>>> information. Then, attach to the running apache worker with the gdb and
> Note this part.

Right, we attached to a worker, that's why it's in accept(). :)

> It seems your libc has no debugging information.
> accept() is the pure syscall wrapper, it cannot call sigprocmask.
> If gdb catched the PLT trampoline instead of real accept(),  we would
> see the rtld frames. So install libc, libthr and rtld with debug.

It's not catching there though:

Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
0x28183b2d in accept () at accept.S:3
3   RSYSCALL(accept)
(gdb) c
Continuing.
no thread to satisfy query
0x28183b2d in accept () at accept.S:3
3   RSYSCALL(accept)
(gdb) info threads
Cannot get thread info: invalid key
(gdb)



Doug

-- 

"We could put the whole Internet into a book."
"Too practical."

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/



Re: 8.2 + apache == a LOT of sigprocmask

2011-11-17 Thread Doug Barton
On 11/17/2011 00:30, Daniil Cherednik wrote:
> I am sorry for repeat (I wrote about it), but what do you think about
> this hack:

Daniil, thanks, and sorry if I wasn't clear before, but the problem
we're seeing has a very clear pattern:

74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)

That the rtld calls don't exhibit.
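
For readers unfamiliar with the call shape in that trace: a SIG_BLOCK
request with a null "set" argument blocks nothing and only reports the
current mask through the third argument. The following is a minimal
Python sketch of that semantics, not the code path httpd takes. It
assumes a Unix system where Python's signal.pthread_sigmask() is
available (the per-thread analogue of sigprocmask()); the helper names
query_mask() and demo() are made up for illustration.

```python
import signal


def query_mask():
    """Return the current signal mask without modifying it.

    Blocking an empty set changes nothing, so the call only reports the
    old mask, analogous to sigprocmask(SIG_BLOCK, NULL, &oset) in C.
    """
    return signal.pthread_sigmask(signal.SIG_BLOCK, set())


def demo():
    """Show that repeated queries never alter the mask (hypothetical helper)."""
    before = query_mask()
    # Actually block one signal so there is something visible in the mask.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    during = query_mask()
    again = query_mask()  # a second query sees exactly the same mask
    # Put the original mask back.
    signal.pthread_sigmask(signal.SIG_SETMASK, before)
    after = query_mask()
    return before, during, again, after


if __name__ == "__main__":
    before, during, again, after = demo()
    print("SIGUSR1 blocked during demo:", signal.SIGUSR1 in during)
    print("query left mask unchanged:", during == again)
    print("mask restored:", after == before)
```

So each of the hundreds of calls in the trace is a read of the mask, which
fits a caller that saves state rather than one that takes a lock.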

Kostik, thanks for your more detailed response, we'll poke that a bit
and report back.


Doug

-- 

"We could put the whole Internet into a book."
"Too practical."

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/



Re: 8.2 + apache == a LOT of sigprocmask

2011-11-17 Thread Daniil Cherednik

On 17.11.2011 11:49, Kostik Belousov wrote:

On Wed, Nov 16, 2011 at 10:46:27PM -0800, Doug Barton wrote:

On 11/15/2011 02:09, Jeremy Chadwick wrote:

On Tue, Nov 15, 2011 at 11:07:45AM +0200, Kostik Belousov wrote:

On Mon, Nov 14, 2011 at 12:51:35PM -0800, Doug Barton wrote:

On 11/14/2011 12:31, Doug Barton wrote:

Trying to track down a load problem we're seeing on 8.2-RELEASE-p4 i386
in a busy web hosting environment I came across the following post:

http://lists.freebsd.org/pipermail/freebsd-questions/2011-October/234520.html

That basically describes what we're seeing as well, including the
"doesn't happen on Linux" part.

Does anyone have any ideas about this?

With incredibly similar stuff running on 7.x we didn't see this problem,
so it seems to be something new in 8.

Just took a closer look at our ktrace, and actually our pattern is
slightly different than the one in that post. In ours the second option
is null, but the third is set:

74195 httpd0.17 RET   sigprocmask 0
74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
74195 httpd0.09 RET   sigprocmask 0
74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
74195 httpd0.09 RET   sigprocmask 0
74195 httpd0.12 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)

But repeated hundreds of times in a row.
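
To put a number on "hundreds of times in a row", the kdump text can be
tallied with a few lines of script. This is only a sketch: the function
names are invented for illustration, and the sample records are the ones
quoted above with the timestamp column left out, since it is garbled in
this archive.

```python
import re
from collections import Counter

# Matches a kdump "CALL" record for sigprocmask and captures its arguments.
CALL_RE = re.compile(r"CALL\s+sigprocmask\(([^)]*)\)")


def tally_sigprocmask(lines):
    """Count sigprocmask CALL records, keyed by their argument string."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def longest_run(lines):
    """Length of the longest run of consecutive sigprocmask CALL/RET records."""
    best = run = 0
    for line in lines:
        if "sigprocmask" in line:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


# Records from the thread; the timestamp column is omitted (assumption:
# it is not needed for counting).
SAMPLE = [
    "74195 httpd RET   sigprocmask 0",
    "74195 httpd CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)",
    "74195 httpd RET   sigprocmask 0",
    "74195 httpd CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)",
    "74195 httpd RET   sigprocmask 0",
    "74195 httpd CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)",
]

if __name__ == "__main__":
    print(tally_sigprocmask(SAMPLE))
    print(longest_run(SAMPLE))
```

Keying the tally on the argument string separates the
(SIG_BLOCK,0,ptr) pattern from calls that pass a non-null set.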

The calls cannot come from rtld, they are generated by some setjmp()
invocation. If signal-safety is not needed, sigsetjmp() should be used
instead.

Quick grep of the apache httpd source shows a single setjmp() in their
copy of pcre. No idea is it to safe to change setjmp() into sigsetjmp(?, 0).

I hate cross-posting, but: adding freebsd-apache@ to the list.  Some of
the Apache folks (not just port committers) may have some insight to
Kostik's findings.

Thanks to everyone for the responses. We tried Kostik's suggestion and
unfortunately it didn't reduce the number of sigprocmask() calls to a
statistically significant degree.

Does anyone have any other ideas on ways to debug this? We're sort of
running out of things to test. :-/

Given how important (and prevalent) the Apache + FreeBSD combination is,
I'm kind of disturbed that we're seeing this performance problem, and if
it's something in 8.x that's also in 9.x, it would be better to fix it
prior to 9.0-RELEASE.

Since my guess appeared to be not useful, the way forward is to identify
the location of the call(s) that cause the issue. I suggest compiling
at least apache itself, libc, rtld and libthr (if used) with debugging
information. Then, attach to the running apache worker with the gdb and
set breakpoint on sigprocmask. Several backtraces from the hit breakpoint
should give enough data.

High-tech solution is to link with libunwind and add code into sigprocmask()
to gather the stacks. But I expect that gdb attach is enough.

I am sorry to repeat myself (I wrote about this before), but what do you
think about this hack:


diff -u ./rtld_lock.c.orig ./rtld_lock.c
--- ./rtld_lock.c.orig 2011-11-15 07:56:14.0 +
+++ ./rtld_lock.c 2011-11-15 07:54:42.0 +
@@ -118,7 +118,7 @@
     sigset_t tmp_oldsigmask;

     for ( ; ; ) {
-        sigprocmask(SIG_BLOCK, &fullsigmask, &tmp_oldsigmask);
+//        sigprocmask(SIG_BLOCK, &fullsigmask, &tmp_oldsigmask);
         if (atomic_cmpset_acq_int(&l->lock, 0, WAFLAG))
             break;
         sigprocmask(SIG_SETMASK, &tmp_oldsigmask, NULL);
@@ -135,7 +135,7 @@
         atomic_add_rel_int(&l->lock, -RC_INCR);
     else {
         atomic_add_rel_int(&l->lock, -WAFLAG);
-        sigprocmask(SIG_SETMASK, &oldsigmask, NULL);
+//        sigprocmask(SIG_SETMASK, &oldsigmask, NULL);
     }
 }

This is one source of the sigprocmask calls. Look at truss
with the original /libexec/ld-elf.so.1:

# truss true
__sysctl(0xbfbfe624,0x2,0xbfbfe62c,0xbfbfe630,0x0,0x0) = 0 (0x0)
mmap(0x0,320,PROT_READ|PROT_WRITE,MAP_ANON,-1,0x0) = 671633408 (0x28085000)
munmap(0x28085000,320) = 0 (0x0)
__sysctl(0xbfbfe688,0x2,0x2807be3c,0xbfbfe690,0x0,0x0) = 0 (0x0)
mmap(0x0,32768,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 671633408 (0x28085000)
issetugid(0x28074967,0xbfbfeb4c,0x104,0x0,0x0,0x0) = 0 (0x0)
open("/etc/libmap.conf",O_RDONLY,0666) ERR#2 'No such file or directory'
open("/var/run/ld-elf.so.hints",O_RDONLY,00) = 3 (0x3)
read(3,"Ehnt\^A\0\0\0\M^@\0\0\0\M^E\0\0"...,128) = 128 (0x80)
lseek(3,0x80,SEEK_SET) = 128 (0x80)
read(3,"/lib:/usr/lib:/usr/lib/compat:/u"...,133) = 133 (0x85)
close(3) = 0 (0x0)
access("/lib/libc.so.7",0) = 0 (0x0)
open("/lib/libc.so.7",O_RDONLY,00) = 3 (0x3)
fstat(3,{ mode=-r--r--r-- ,inode=94234,size=1155172,blksize=16384 }) = 0 (0x0)
pread(0x3,0x2807ad80,0x1000,0x0,0x0,0x0) = 4096 (0x1000)
mmap(0x0,1159168,PROT_NONE,MAP_PRIVATE|MAP_ANON|MAP_NOCORE,-1,0x0) = 671666176 (0x2808d000)
mmap(0x2808d000,1040384,PROT_READ|PROT_EXEC,MAP_PRIVATE|MAP_FIXED|MAP_NOCORE,3,0x0) = 671666176 (0x2808d000)
mmap(0x2818b000,24576,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_FIXED,3,0xfe000) = 672706560 (0x2818b000)
mprotect(0x28191000,94208,PROT_READ|PROT_WRITE) = 0 (0x0)
close(3) = 0 (0x0)
sysarch(0xa,0xbfbfe6f0,0x2804b

Re: 8.2 + apache == a LOT of sigprocmask

2011-11-17 Thread Kostik Belousov
On Wed, Nov 16, 2011 at 11:59:06PM -0800, Doug Barton wrote:
> On 11/16/2011 23:49, Kostik Belousov wrote:
> > On Wed, Nov 16, 2011 at 10:46:27PM -0800, Doug Barton wrote:
> >> On 11/15/2011 02:09, Jeremy Chadwick wrote:
> >>> On Tue, Nov 15, 2011 at 11:07:45AM +0200, Kostik Belousov wrote:
>  On Mon, Nov 14, 2011 at 12:51:35PM -0800, Doug Barton wrote:
> > On 11/14/2011 12:31, Doug Barton wrote:
> >> Trying to track down a load problem we're seeing on 8.2-RELEASE-p4 i386
> >> in a busy web hosting environment I came across the following post:
> >>
> >> http://lists.freebsd.org/pipermail/freebsd-questions/2011-October/234520.html
> >>
> >> That basically describes what we're seeing as well, including the
> >> "doesn't happen on Linux" part.
> >>
> >> Does anyone have any ideas about this?
> >>
> >> With incredibly similar stuff running on 7.x we didn't see this 
> >> problem,
> >> so it seems to be something new in 8.
> >
> > Just took a closer look at our ktrace, and actually our pattern is
> > slightly different than the one in that post. In ours the second option
> > is null, but the third is set:
> >
> > 74195 httpd0.17 RET   sigprocmask 0
> > 74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> > 74195 httpd0.09 RET   sigprocmask 0
> > 74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> > 74195 httpd0.09 RET   sigprocmask 0
> > 74195 httpd0.12 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
> >
> > But repeated hundreds of times in a row.
> 
>  The calls cannot come from rtld, they are generated by some setjmp()
>  invocation. If signal-safety is not needed, sigsetjmp() should be used
>  instead.
> 
>  Quick grep of the apache httpd source shows a single setjmp() in their
>  copy of pcre. No idea is it to safe to change setjmp() into sigsetjmp(?, 
>  0).
> >>>
> >>> I hate cross-posting, but: adding freebsd-apache@ to the list.  Some of
> >>> the Apache folks (not just port committers) may have some insight to
> >>> Kostik's findings.
> >>
> >> Thanks to everyone for the responses. We tried Kostik's suggestion and
> >> unfortunately it didn't reduce the number of sigprocmask() calls to a
> >> statistically significant degree.
> >>
> >> Does anyone have any other ideas on ways to debug this? We're sort of
> >> running out of things to test. :-/
> >>
> >> Given how important (and prevalent) the Apache + FreeBSD combination is,
> >> I'm kind of disturbed that we're seeing this performance problem, and if
> >> it's something in 8.x that's also in 9.x, it would be better to fix it
> >> prior to 9.0-RELEASE.
> > 
> > Since my guess appeared to be not useful,
> 
> Well I wouldn't say that they weren't useful, we eliminated the obvious
> candidate. So, "not good news" certainly, but not unhelpful. :)
> 
> > the way forward is to identify
> > the location of the call(s) that cause the issue. I suggest compiling
> > at least apache itself, libc, rtld and libthr (if used) with debugging
> > information. Then, attach to the running apache worker with the gdb and
Note this part.

> > set breakpoint on sigprocmask. Several backtraces from the hit breakpoint
> > should give enough data.
> 
> We tried that, and got this:
> 
> Loaded symbols for /libexec/ld-elf.so.1
> 0x28183a5d in accept () from /lib/libc.so.7
> (gdb) b sigprocmask
> Breakpoint 1 at 0x282d8f84
> (gdb) c
> Continuing.
> no thread to satisfy query
> 0x28183a5d in accept () from /lib/libc.so.7
> (gdb)
It seems your libc has no debugging information.
accept() is a pure syscall wrapper; it cannot call sigprocmask.
If gdb had caught the PLT trampoline instead of the real accept(), we would
see the rtld frames. So install libc, libthr and rtld with debug symbols.

Also, having debug symbols for apache itself can be useful.

> 
> Of course I'm not the world's greatest gdb'er, so maybe there is a
> better way to do it?
> 
> > High-tech solution is to link with libunwind and add code into sigprocmask()
> > to gather the stacks. But I expect that gdb attach is enough.
> 
> Ok, we'll look into that, thanks.
> 
> 
> Doug
> 
> -- 
> 
>   "We could put the whole Internet into a book."
>   "Too practical."
> 
>   Breadth of IT experience, and depth of knowledge in the DNS.
>   Yours for the right price.  :)  http://SupersetSolutions.com/




Re: 8.2 + apache == a LOT of sigprocmask

2011-11-17 Thread Doug Barton
On 11/16/2011 23:49, Kostik Belousov wrote:
> On Wed, Nov 16, 2011 at 10:46:27PM -0800, Doug Barton wrote:
>> On 11/15/2011 02:09, Jeremy Chadwick wrote:
>>> On Tue, Nov 15, 2011 at 11:07:45AM +0200, Kostik Belousov wrote:
>>>> On Mon, Nov 14, 2011 at 12:51:35PM -0800, Doug Barton wrote:
>>>>> On 11/14/2011 12:31, Doug Barton wrote:
>>>>>> Trying to track down a load problem we're seeing on 8.2-RELEASE-p4 i386
>>>>>> in a busy web hosting environment I came across the following post:
>>>>>>
>>>>>> http://lists.freebsd.org/pipermail/freebsd-questions/2011-October/234520.html
>>>>>>
>>>>>> That basically describes what we're seeing as well, including the
>>>>>> "doesn't happen on Linux" part.
>>>>>>
>>>>>> Does anyone have any ideas about this?
>>>>>>
>>>>>> With incredibly similar stuff running on 7.x we didn't see this problem,
>>>>>> so it seems to be something new in 8.
>>>>>
>>>>> Just took a closer look at our ktrace, and actually our pattern is
>>>>> slightly different than the one in that post. In ours the second argument
>>>>> is null, but the third is set:
>>>>>
>>>>> 74195 httpd0.17 RET   sigprocmask 0
>>>>> 74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
>>>>> 74195 httpd0.09 RET   sigprocmask 0
>>>>> 74195 httpd0.13 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
>>>>> 74195 httpd0.09 RET   sigprocmask 0
>>>>> 74195 httpd0.12 CALL  sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
>>>>>
>>>>> But repeated hundreds of times in a row.
>>>>
>>>> The calls cannot come from rtld, they are generated by some setjmp()
>>>> invocation. If signal-safety is not needed, sigsetjmp() should be used
>>>> instead.
>>>>
>>>> Quick grep of the apache httpd source shows a single setjmp() in their
>>>> copy of pcre. No idea whether it is safe to change setjmp() into
>>>> sigsetjmp(?, 0).
>>>
>>> I hate cross-posting, but: adding freebsd-apache@ to the list.  Some of
>>> the Apache folks (not just port committers) may have some insight to
>>> Kostik's findings.
>>
>> Thanks to everyone for the responses. We tried Kostik's suggestion and
>> unfortunately it didn't reduce the number of sigprocmask() calls to a
>> statistically significant degree.
>>
>> Does anyone have any other ideas on ways to debug this? We're sort of
>> running out of things to test. :-/
>>
>> Given how important (and prevalent) the Apache + FreeBSD combination is,
>> I'm kind of disturbed that we're seeing this performance problem, and if
>> it's something in 8.x that's also in 9.x, it would be better to fix it
>> prior to 9.0-RELEASE.
> 
> Since my guess appeared to be not useful,

Well, I wouldn't say that it wasn't useful; it let us eliminate the obvious
candidate. So, "not good news" certainly, but not unhelpful. :)

> the way forward is to identify
> the location of the call(s) that cause the issue. I suggest compiling
> at least apache itself, libc, rtld and libthr (if used) with debugging
> information. Then, attach to the running apache worker with the gdb and
> set breakpoint on sigprocmask. Several backtraces from the hit breakpoint
> should give enough data.

We tried that, and got this:

Loaded symbols for /libexec/ld-elf.so.1
0x28183a5d in accept () from /lib/libc.so.7
(gdb) b sigprocmask
Breakpoint 1 at 0x282d8f84
(gdb) c
Continuing.
no thread to satisfy query
0x28183a5d in accept () from /lib/libc.so.7
(gdb)

Of course I'm not the world's greatest gdb'er, so maybe there is a
better way to do it?

> High-tech solution is to link with libunwind and add code into sigprocmask()
> to gather the stacks. But I expect that gdb attach is enough.

Ok, we'll look into that, thanks.


Doug

-- 

"We could put the whole Internet into a book."
"Too practical."

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/
