from:"Mel Flynn"

Re: FreeBSD Boot Times

2012-06-13 Thread Mel Flynn

On 13-6-2012 23:16, claudiu vasadi wrote:

 
> If you simply do sysctl -d hw.usb.no_boot_wait you will see the
> explanation ;)

That is probably why Eitan asked, as that description:
a) means nothing to people unfamiliar with device enumeration
b) does not point to a manual page that explains how USB does device
enumeration and why it would account for a significant chunk of the
boot process.

-- 
Mel
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org

Re: FreeBSD Boot Times

2012-06-13 Thread Mel Flynn

On 12-6-2012 0:51, Garrett Cooper wrote:
> On Mon, Jun 11, 2012 at 3:21 PM, Brandon Falk bfalk_...@brandonfa.lk wrote:
> > Greetings,
> >
> > I was just wondering what it is that FreeBSD does that makes it take so long
> > to boot. Booting into Ubuntu minimal or my own custom Linux distro,
> > literally takes 0.5-2 seconds to boot up to shell, where FreeBSD takes about
> > 10-20 seconds. I'm not sure if anything could be parallelized in the boot
> > process, but Linux somehow manages to do it. The Ubuntu install I do pretty
> > much consists of a shell and developers tools, but it still has a generic
> > kernel. There must be some sort of polling done in the FreeBSD boot process
> > that could be parallelized or eliminated.
> >
> > Anyone have any suggestions?
> >
> > Note: This isn't really an issue, moreso a curiosity.
>
> The single process nature of rc is a big part of the problem, as
> is the single AP bootup of FreeBSD right before multiuser mode. There
> are a number of threads that discuss this (look for parallel rc bootup
> or something like that in the current, hacker, and rc archives -- the
> most recent discussion was probably 6~9 months ago).
> Given past experience, a big part of getting past the parallelized
> rc mess would be to make services fail/wait gracefully for all their
> resources to come up before proceeding. It's not easy, but it's
> possible with enough resources.

I realize people are working on this and that it's generally a good
thing. However, please don't underestimate the importance of getting an
accurate list of what boots when and, equally important, how it shuts down.
rcorder is very valuable in diagnosing why certain services fail to
start or throw fits, but you have to be able to match it with output
from the rc script (something that not all scripts produce, I might add).
-- 
Mel

Re: nss compat shims

2012-06-09 Thread Mel Flynn

On 8-6-2012 22:06, Michael Bushkov wrote:

> I don't know for sure, but it seems that there was no need to support
> anything besides groups and password when nss_compat.c was committed.
> At that time, IIRC, the modules that we had in ports supported only
> these databases.

Right. In the meantime hosts has been implemented in nss-pam-ldapd but
currently does not work (at least on 9). The author has also wondered
whether services, protocols etc. are needed features, but I guess you
won't find that out until you provide the feature. At least it seems
logical to me to keep the nss_compat interface synchronized with the
APIs that call nsdispatch. I guess I'll make patches for both.

-- 
Mel

nss compat shims

2012-06-07 Thread Mel Flynn

Hi,

I'm currently implementing support for more NSS databases in
net/nss-pam-ldapd and I've started to wonder why there's only group
and password support in lib/libc/net/nss_compat.c.

Is there a specific reason for it, or is this a classic case of "patches
welcome"? I wasn't able to determine the reason from the commit log.
-- 
Mel

Re: nvidia-driver-295.49 is highly unstable

2012-06-04 Thread Mel Flynn

On 4-6-2012 9:54, Pegasus Mc Cleaft wrote:

> I forgot to switch back to base gcc before
> compiling the nvidia driver. I installed the driver and rebooted, xorg
> came up but as soon as I logged in, kwin crashed, then the machine
> kernel panicked and rebooted. I rebooted and crashed a few more times
> before realizing what I had done. I recompiled the driver with GCC and
> the machine has been rock-stable with the nvidia driver.
>
> Maybe the OP has done the same thing and not realized it?

That would account for what I'm seeing too, and I know I've compiled
nvidia-driver with clang. In my case it's Xfce4 and there are no panics,
but the desktop is completely unusable (fonts too large, offsets calculated
wrongly, terminal emulation like opening something in vi with
TERM=dumb, jerky mouse movement, machine under constant load).

-- 
Mel

Re: Activating libssp

2012-05-29 Thread Mel Flynn

On 28-5-2012 23:22, Jeremie Le Hen wrote:
> Hi Mel,
>
> On Sun, May 27, 2012 at 08:15:02PM +0200, Mel Flynn wrote:
> > Hi,
> >
> > for a port, I'm seeing:
> > #ifdef _FORTIFY_SOURCE
> > ...
> > #endif
> >
> > I did a bit of reading (http://wiki.debian.org/Hardening) for example,
> > searching through /usr/share/mk/* /usr/include/libssp, /usr/src/gnu/libssp.
> >
> > However, it's not clear to me, where the magic is that pulls in the
> > libssp library that is in /lib.
> >
> > Also - it seems to be part of gcc, so does that mean on systems without
> > gcc, that this library is not available or does clang have a variant?
>
> gnu/lib/libssp is built for compatibility reasons.  See
> http://svnweb.freebsd.org/base?view=revision&revision=169718

This clarifies a bit about the existence of libssp as a shared lib. Thanks.

> Our libc provides the necessary symbols.
> http://svnweb.freebsd.org/base/head/lib/libc/sys/stack_protector.c
>
> > I do see -fstack-protector is added to CFLAGS by default, so I'm
> > thinking there's some magic somewhere, but I'm just missing the docs
> > that tell me if you add foo to CFLAGS then bar will happen, unless baz.
>
> I'm not sure what you mean, but -fstack-protector is documented in GCC
> documentation, I suppose it's the same for Clang but I didn't check.
> You can disable it on FreeBSD by setting WITHOUT_SSP in src.conf(5).

Right, I wasn't very clear with that, so let me clarify:
- _FORTIFY_SOURCE is used in /usr/include/ssp/ssp.h
- There is a shared library /lib/libssp.so
- In the sources of the software there is no mention of ssp.h or -lssp
- In the sources of the software there are conditionals based on
_FORTIFY_SOURCE being defined.

So, for me as port maintainer, it looks as though adding
-D_FORTIFY_SOURCE=2 does absolutely nothing for the software unless I
also #include <ssp/ssp.h> and add -lssp to LDFLAGS, or unless there's some
magic in libc or the compiler that activates bits and overrides the
definitions for the symbols.
Based on the commit message, I assume that adding _FORTIFY_SOURCE to
CFLAGS does nothing, as the actual setting of this flag is compiled into
libc.
And -fstack-protector tells the compiler to activate the stack protector
callbacks that are again, implemented in libc. Without this, they won't
be activated. Does this sound correct?

As a side note, the code in question does not apply to FreeBSD at all, as
we have no __FD_SETSIZE symbol anywhere that I can find:
#ifndef _FORTIFY_SOURCE
#include <bits/types.h>
#undef __FD_SETSIZE
#define __FD_SETSIZE 8192
#endif
so I'm either patching that to read FD_SETSIZE or adding -DFD_SETSIZE=8192 to
CFLAGS (not sure which I'll use yet).
-- 
Mel

Activating libssp

2012-05-27 Thread Mel Flynn

Hi,

for a port, I'm seeing:
#ifdef _FORTIFY_SOURCE
...
#endif

I did a bit of reading (http://wiki.debian.org/Hardening, for example) and
searched through /usr/share/mk/*, /usr/include/libssp and /usr/src/gnu/libssp.

However, it's not clear to me, where the magic is that pulls in the
libssp library that is in /lib.
Also - it seems to be part of gcc, so does that mean on systems without
gcc, that this library is not available or does clang have a variant?

I do see -fstack-protector is added to CFLAGS by default, so I'm
thinking there's some magic somewhere, but I'm just missing the docs
that tell me if you add foo to CFLAGS then bar will happen, unless baz.
-- 
Mel

Re: GSoC Project: Automated Kernel Crash Reporting System - Discussion

2012-05-19 Thread Mel Flynn

On 19-5-2012 5:54, Tim Kientzle wrote:
 
> On May 18, 2012, at 7:51 AM, Mel Flynn wrote:
>
> > On 17-5-2012 14:53, Mateusz Guzik wrote:
> > > On Wed, May 16, 2012 at 11:37:44PM +0300, tza...@it.teithe.gr wrote:
> > > >
> > > > Nice. What about curl over the HTTPS protocol?
> > > >
> > >
> > > curl would be ok, except it's not in the base system.
> >
> > For this reason, it's probably best to use tar(1) to package up multiple
> > files and implement http put support in libfetch(3). You may also need
> > to implement 305 Use Proxy support.
>
> Depends on where the files are coming from.  If you
> have files on disk, then tar(1) might be a good choice.
> If you're going to have to construct the files, then you
> can maybe avoid writing them to disk by using libarchive(3)
> directly instead of going through the tar command-line
> interface.

As I read it, the original intent is to post crash dumps to a specified
remote location through rc(8), using an sh(1) script on the next reboot.
tar seemed appropriate for that.

I'm only mentioning extending libfetch(3) because it will be easy for
fetch(1) to pick it up, it benefits more than just this project, and once
integrated into fetch(1) it can be used in the script mentioned above.

Other than OpenSSH we don't really have a good tool in the base system
to put local files elsewhere securely. Also, if the BUGS section of
fetch(3) is out of date, I'm happy to be corrected :)
-- 
Mel

Re: GSoC Project: Automated Kernel Crash Reporting System - Discussion

2012-05-18 Thread Mel Flynn

On 17-5-2012 14:53, Mateusz Guzik wrote:
> On Wed, May 16, 2012 at 11:37:44PM +0300, tza...@it.teithe.gr wrote:
> >
> > Nice. What about curl over the HTTPS protocol?
> >
>
> curl would be ok, except it's not in the base system.

For this reason, it's probably best to use tar(1) to package up multiple
files and implement HTTP PUT support in libfetch(3). You may also need
to implement "305 Use Proxy" support.
-- 
Mel

Debugging zombies: pthread_sigmask and sigwait

2012-04-11 Thread Mel Flynn

Hi,

I'm currently stuck on a bug in Zarafa-spooler that creates zombies, and
am working around it by claiming that our pthread library isn't normal,
which makes it use standard signals rather than a signal thread.

My limited understanding of these facilities is however not enough to
see the actual problem here and reading of related manpages did not lead
me to a solution either. A test case reproducing the problem is attached.

What happens is that SIGCHLD is never received by the signal thread and
the child processes turn to zombies. Signal counters never go up, not
even for SIGINFO, which I added specifically to see if anything gets
through at all.

The signal thread shows up as stuck in sigwait. It's reproducible on
8.3-PRERELEASE of a few days ago (r233768). I'm not able to test it on
anything newer unfortunately, but I suspect this is a bug/linuxism in
the code, not in FreeBSD.

Thanks in advance for any insights.
-- 
Mel
PROG=spoolerbug
NO_MAN=yes
DEBUG_FLAGS=-g3
WARNS=6
WITH_DEBUG=yes
LDFLAGS+=-pthread

.include "../mk/core.mk"
.include <bsd.prog.mk>
/*
 * vim: ts=4 sw=4 tw=78 noet ai fdm=marker
 */
#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");

#include <sys/types.h>
#include <sys/wait.h>

#include <pthread.h>
#include <signal.h> /* signal related */
#include <unistd.h> /* vfork */

#include <stdlib.h> /* arc4random() */
#include <stdbool.h>
#include <getopt.h>

#include <stdio.h> /* printing */

#include <err.h>

#define SERVER_ITERATIONS 3

/* declarations */
void *signal_handler(void *);
int running_server(void);
void process_signal(int);

/* globals */
pthread_t   signal_thread;
sigset_t    signal_mask;
bool        bQuit = false;
pid_t       lastPid = 0;
char        *szCommand;
size_t      n_sigs_handled = 0;
size_t      n_sigs_child = 0;
size_t      n_sigs_info = 0;

void *
signal_handler(void *args __unused)
{
    int sig;

    /* Wait for any signal in signal_mask and dispatch it. */
    while( !bQuit && sigwait(&signal_mask, &sig) == 0 )
    {
        n_sigs_handled++;
        process_signal(sig);
    }

    return NULL;
}

int
running_server(void)
{
    u_int32_t r, max = 10;
    pid_t pid, me;
    int i = 0;

    me = getpid();
    warnx("[master]: Send SIGINFO to %u", (unsigned)me);
    do
    {
        warnx("[master]: lastPid = %u, n_sigs_handled=%zu, n_sigs_child=%zu "
            "n_sigs_info=%zu", (unsigned)lastPid, n_sigs_handled,
            n_sigs_child, n_sigs_info);
        pid = vfork();
        if( pid < 0 )
            break;
        if( pid == 0 )
        {
            /* Child: re-exec ourselves with -F so it takes the forked path. */
            execl(szCommand, getprogname(), "-F", NULL);
            _exit(EXIT_FAILURE);
        }
        else
        {
            if( bQuit )
                break;
            warnx("[master]: Child spawned with pid %u", (unsigned)pid);
            r = arc4random() % max;
            sleep((unsigned int)r);
        }
    } while( !bQuit && i++ < SERVER_ITERATIONS );
    return (0);
}

void
process_signal(int sig)
{
    int stat;
    pid_t pid;

    switch(sig)
    {
        case SIGTERM:
        case SIGINT:
            bQuit = true;
            break;
        case SIGCHLD:
            n_sigs_child++;
            /* Reap every child that has exited so far. */
            while( (pid = waitpid(-1, &stat, WNOHANG)) > 0)
            {
                lastPid = pid;
            }
            break;
        case SIGINFO:
            n_sigs_info++;
            break;
        default:
            signal(sig, SIG_IGN);
            break;
    }
}

int
main(int argc, char *argv[])
{
    bool bForked = false;
    const char *opts = "F";
    int ch, hr, rc;

    szCommand = argv[0];
    while( (ch = getopt(argc, argv, opts)) != -1 )
    {
        if( ch == 'F' )
            bForked = true;
    }

    argc -= optind;
    argv += optind;

    if( !bForked )
    {
        sigemptyset(&signal_mask);
        sigaddset(&signal_mask, SIGTERM);
        sigaddset(&signal_mask, SIGINT);
        sigaddset(&signal_mask, SIGCHLD);
        sigaddset(&signal_mask, SIGINFO);
    }

    daemon(1, 1);
    if( !bForked )
    {
        /* Block the signals here so the signal thread inherits the mask. */
        rc = pthread_sigmask(SIG_BLOCK, &signal_mask, NULL);
        if( rc != 0 )
            err(EXIT_FAILURE, "pthread_sigmask()");

        pthread_create(&signal_thread, NULL, signal_handler, NULL);
        hr = running_server();
        warnx("[master]: Joining signal thread");
        pthread_join(signal_thread, NULL);
    }
    else

Re: Debugging zombies: pthread_sigmask and sigwait

2012-04-11 Thread Mel Flynn

On 4/11/2012 16:26, Ian Lepore wrote:
> On Wed, 2012-04-11 at 16:11 +0200, Mel Flynn wrote:
> >
> > What happens is that SIGCHLD is never received by the signal thread and
> > the child processes turn to zombies. Signal counters never go up, not
> > even for SIGINFO, which I added specifically to see if anything gets
> > through at all.
> >
> > The signal thread shows being stuck in sigwait. It's reproducible on
> > 8.3-PRERELEASE of a few days ago (r233768). I'm not able to test it on
> > anything newer unfortunately, but I suspect this is a bug/linuxism in
> > the code not in FreeBSD.
>
> The signal mask for a new thread is inherited from the parent thread.
> In your example code, the signal handling thread inherits the blocked
> status of the signals as set up in main().  Try adding this line to
> signal_handler() before it goes into its while() loop:
>
>   pthread_sigmask(SIG_UNBLOCK, &signal_mask, NULL);

That doesn't change anything and is in contrast to what sigwait(2) says:

 The signals specified by set /should be blocked/ at the time of the
 call to sigwait().

I also thought about a different child touching the signal code and two
processes blocked in sigwait in the original code (they fork a logger
process prior to sigemptyset()), but I explicitly avoid that in the test
case.
-- 
Mel

Re: Debugging zombies: pthread_sigmask and sigwait

2012-04-11 Thread Mel Flynn

On 4/11/2012 16:47, Konstantin Belousov wrote:

> What happens, as I guess it, the SIGINFO and SIGCHLD are ignored, so
> kernel do not even bother to queue the signals to the master process.
> Register a dummy signal handler for your signals with sigaction
> before creating 'signal_handler' thread.

Right on the mark. I've modified the test code accordingly and things
work as expected. I've also applied the logic to the Zarafa spooler and
in the logs I'm finally seeing:
child: [79572] E-mail for user mel was accepted by SMTP server
parent: [79565] Received signal 20
^^

Many thanks and for the archives, the diff below sig.
-- 
Mel

diff -r 509d7301c720 spoolerbug/spoolerbug.c
--- a/spoolerbug/spoolerbug.c   Wed Apr 11 05:37:50 2012 -0800
+++ b/spoolerbug/spoolerbug.c   Wed Apr 11 07:35:50 2012 -0800
@@ -12,6 +12,7 @@
 #include <unistd.h> /* vfork */

 #include <stdlib.h> /* arc4random() */
+#include <string.h> /* memset() */
 #include <stdbool.h>
 #include <getopt.h>

@@ -25,6 +26,7 @@
 void *signal_handler(void *);
 int running_server(void);
 void process_signal(int);
+void signal_dummy(int);

 /* globals */
 pthread_t  signal_thread;
@@ -112,6 +114,12 @@
    }
 }

+void
+signal_dummy(int sig __unused)
+{
+   return;
+}
+
 int
 main(int argc, char *argv[])
 {
@@ -131,11 +139,19 @@

    if( !bForked )
    {
+       struct sigaction dummies;
+
+       memset(&dummies, 0, sizeof(dummies));
        sigemptyset(&signal_mask);
        sigaddset(&signal_mask, SIGTERM);
        sigaddset(&signal_mask, SIGINT);
        sigaddset(&signal_mask, SIGCHLD);
        sigaddset(&signal_mask, SIGINFO);
+       dummies.sa_handler = signal_dummy;
+       dummies.sa_mask = signal_mask;
+       dummies.sa_flags |= SA_NOCLDSTOP;
+       sigaction(SIGCHLD, &dummies, NULL);
+       sigaction(SIGINFO, &dummies, NULL);
    }

    daemon(1, 1);


Re: How to use pfind in freeBSD

2012-03-16 Thread Mel Flynn

On 3/16/2012 08:09, kota saikrishna wrote:
> I need to get process data structure using a pid. I found the pfind
> function which returns struct proc *   but when i tried to use pfind
> function it is showing ---undefined reference to `pfind'
> Can any one suggest how to use pfind() function?

From userland, see kvm_openfiles(3) and kvm_getprocs(3).

-- 
Mel

Re: Jail on 2 interfaces?

2009-12-23 Thread Mel Flynn

On Wednesday 23 December 2009 01:19:23 Bjoern A. Zeeb wrote:
> On Tue, 22 Dec 2009, Mel Flynn wrote:
>
> Hi,
>
> first of all this would find more people to help on freebsd-jail as it
> has nothing to do with hackers ;-)

Yes, that was pretty braindead of me, especially since the intention was
questi...@.

> > I don't see this documented in jail(8) nor rc(8) nor defaults/rc.conf, so
> > is it possible to have 2 IP's on 2 ethernet interfaces? And if so, is it
> > settable for rc(8)?
> >
> > The usage case is to have the same jailed proxy server on two separate
> > internal networks. Ideally, the proxy will use one address for outgoing,
> > so I guess I'll need a default route or dive into the squid config.
> >
> > At present I have:
> > ifconfig_bge0="inet 192.168.177.60  netmask 255.255.255.0"
> > ifconfig_em0="inet 192.168.176.60 netmask 255.255.255.0"
> > ifconfig_em0_alias0="inet 192.168.176.62 netmask 255.255.255.255"
> > jail_squid_rootdir="/usr/squid"
> > jail_squid_ip="192.168.177.62"
> > jail_squid_ip_multi0="192.168.176.62"
> > jail_squid_interface="bge0"
> >
> > But this created the IP on bge0 even though one exists on em0. Is it as
> > simple as not specifying the interface and add the 177.62 alias on bge0?
> > Ideally I'd have a jail_$jail_ip_multi$aliasno_interface=foo0, but my
> > main worry is that the jail infrastructure understands the routing
> > involved.
>
> From what you are writing I assume that you are on FreeBSD 7.2-Release
> or later; no official FreeBSD version before had supported
> multiple-IPs with a jail.

8.0-p3, yes.

> What it did was what you were asking for.  That's the problem.
>
> 1) either use ifconfig
> 2) or use jail + interfaces
> 3) but do not mix them (especially not overlapping)
>
> So I would suggest to do it like this:
>
> # Base system IPs.
> ifconfig_bge0="inet 192.168.177.60/24"
> ifconfig_em0="inet 192.168.176.60/24"
>
> jail_squid_rootdir="/usr/squid"
> # Either use:
> jail_squid_ip="bge0|192.168.177.62/32,em0|192.168.176.62/32"
> # or:
> jail_squid_ip="bge0|192.168.177.62/32"
> jail_squid_ip_multi0="em0|192.168.176.62/32"
>
> but do not use jail_squid_interface=.. as that will be a global
> default for that jail.

Is it a global *default* or a global? For example, could I specify:
jail_squid_interface="bge0"
jail_squid_ip="192.168.177.62/32"
jail_squid_ip_multi0="192.168.177.63/32"
jail_squid_ip_multi1="em0|192.168.177.62/32"

Below is a patch against HEAD to document the $interface|$ip syntax.
-- 
Mel

Index: etc/defaults/rc.conf
===
--- etc/defaults/rc.conf        (revision 200901)
+++ etc/defaults/rc.conf        (working copy)
@@ -648,6 +648,7 @@
 #jail_example_fib="0"  # Routing table for setfib(1)
 #jail_example_ip="192.0.2.10,2001:db8::17" # Jail's primary IPv4 and IPv6 address
 #jail_example_ip_multi0="2001:db8::10" #  and another IPv6 address
+#jail_example_ip_multi1="em0|192.0.3.10/32" #  and another IPv4 address on a specific interface
 #jail_example_exec_start="/bin/sh /etc/rc" # command to execute in jail for starting
 #jail_example_exec_afterstart0="/bin/sh command" # command to execute after the one for
                                                  # starting the jail. More than one can be

Jail on 2 interfaces?

2009-12-22 Thread Mel Flynn

Hi,

I don't see this documented in jail(8), rc(8) or defaults/rc.conf, so is
it possible to have 2 IPs on 2 ethernet interfaces? And if so, can it be set
through rc(8)?

The use case is to have the same jailed proxy server on two separate
internal networks. Ideally, the proxy will use one address for outgoing
traffic, so I guess I'll need a default route or to dive into the squid config.

At present I have:
ifconfig_bge0="inet 192.168.177.60  netmask 255.255.255.0"
ifconfig_em0="inet 192.168.176.60 netmask 255.255.255.0"
ifconfig_em0_alias0="inet 192.168.176.62 netmask 255.255.255.255"
jail_squid_rootdir="/usr/squid"
jail_squid_ip="192.168.177.62"
jail_squid_ip_multi0="192.168.176.62"
jail_squid_interface="bge0"

But this created the IP on bge0 even though one exists on em0. Is it as simple
as not specifying the interface and adding the 177.62 alias on bge0?
Ideally I'd have a jail_$jail_ip_multi$aliasno_interface=foo0, but my main
worry is whether the jail infrastructure understands the routing involved.
-- 
Mel

Re: Superpages on amd64 FreeBSD 7.2-STABLE

2009-12-07 Thread Mel Flynn

On Thursday 26 November 2009 18:11:10 Linda Messerschmidt wrote:

> I did not mean to suggest that we were asking for help solving a
> problem with squid rotation.  I provided that information as
> background to discuss what we observed as a potential misbehavior in
> the new VM superpages feature, in the hope that if there is a problem
> with the new feature, we can help find/resolve it or, if this is
> working as intended, hopefully gain some insight as to what's going
> on.

I tend to agree with this. Though I don't know the nitty-gritty of the
implementation, it seems that:
a) superpages aren't copied efficiently (at all?) on fork, and probably in
other workloads
b) vfork is encouraged for memory-intensive applications, yet:
BUGS
 This system call will be eliminated when proper system sharing mechanisms
 are implemented.  Users should not depend on the memory sharing semantics
 of vfork() as it will, in that case, be made synonymous to fork(2).

So is this entire problem eliminated once system sharing mechanisms are in
place, with vfork considered a temporary workaround, or is copying of
superpages a problem that remains?
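
For context, the usual vfork-then-exec pattern that point b) refers to can be
sketched as follows. The function name run_with_vfork is made up for this
example, and the use of /bin/sh as the program to execute is an assumption.
The child borrows the parent's address space until it calls exec or _exit(),
so nothing has to be copied, which is why the pattern is suggested for
processes with a large memory footprint.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run "/bin/sh -c <cmd>" via vfork() and return its exit status,
 * or -1 if the child could not be created or did not exit normally.
 */
int
run_with_vfork(const char *cmd)
{
    int status;
    pid_t pid;

    assert(cmd != NULL);
    pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: only exec or _exit() are safe at this point. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);     /* exec failed */
    }

    /* Parent: resumes once the child has exec'ed or exited. */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

This relies on exactly the memory-sharing semantics that the BUGS section
warns against depending on, so it only illustrates the trade-off being asked
about here.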
-- 
Mel

Re: Issue with grep -i (on i386 only?)

2009-11-04 Thread Mel Flynn

On Wednesday 04 November 2009 04:05:44 Eygene Ryabinkin wrote:
> Mel, good day.
>
> Tue, Nov 03, 2009 at 09:22:28PM +0100, Mel Flynn wrote:
> > So on the laptop I modified the testscript as it is attached now and
> > while there is still a significant delay, the wallclock time is less
> > then half, when the expression is rewritten with the same meaning:
> > = 16777216
> > = fgrep
> > 0.04 real 0.03 user 0.00 sys
> > 0.05 real 0.03 user 0.01 sys
> > 0.02 real 0.00 user 0.00 sys
> > = pcregrep
> > 0.26 real 0.21 user 0.02 sys
> > 0.26 real 0.22 user 0.02 sys
> > 0.44 real 0.35 user 0.01 sys
> > = grep
> > 0.04 real 0.04 user 0.00 sys
> > 4.45 real 4.15 user 0.01 sys
> > 2.00 real 1.81 user 0.00 sys <-- [fF][Oo][Oo]
>
> Just did a quick test on the 8.0-RC2/i386 with very old Athlon processor:
> -
> = 16777216
> = fgrep
> 0,09 real 0,04 user 0,05 sys
> 0,18 real 0,06 user 0,03 sys
> 0,05 real 0,01 user 0,04 sys
> = pcregrep
> 0,47 real 0,29 user 0,07 sys
> 0,52 real 0,33 user 0,07 sys
> 0,77 real 0,45 user 0,03 sys
> = grep
> 0,09 real 0,08 user 0,01 sys
> 0,10 real 0,04 user 0,05 sys
> 0,23 real 0,12 user 0,03 sys
> -
> Pattern for the plain 'grep' is stable: first and second variants always
> give the same time within a 0.01 second variation and the last variant
> gives 2x slowdown.
>
> I tried sizes up to the 64M -- the pattern stays.  The same stuff for
> the amd64, so in my case I don't see the difference in behaviour.  So,
> maybe, the problem isn't 32 vs 64 but lies somewhere else.

Well, I just ruled out the last commonality: the i386 machines tested all had
MAXPHYS set to 1M, except the one I just tried:
= 16777216
= fgrep
0.04 real 0.03 user 0.00 sys
0.04 real 0.03 user 0.00 sys
0.02 real 0.00 user 0.01 sys
= grep
0.04 real 0.02 user 0.02 sys
3.70 real 3.56 user 0.00 sys
1.91 real 1.83 user 0.02 sys

Using env MALLOC_OPTIONS= also has no impact at all (just in case defaults 
aren't that). Since fgrep is fast and basically seeds the cache for grep, I'm 
ruling out disks/io reads. In fact, /tmp on this laptop is memory disk (one 
reason I couldn't go up to 64M :)). I honestly can't figure out what my 'local 
problem' could be or your optimization.

Thanks for the fix ups. One more below sig.
-- 
Mel

--- grep-test.sh.orig   2009-11-04 03:17:05.0 -0900
+++ grep-test.sh        2009-11-04 03:29:55.0 -0900
@@ -34,6 +34,10 @@
        ;;
    esac

+   if [ ! -f ${TMPFILE} ]; then
+       # signalled
+       exit 0;
+   fi
    jot -r -c ${b} a z |rs -g 0 20 > ${TMPFILE}
    echo = ${b}
    for prog in fgrep ${PCREGREP} ${BSDGREP} grep ; do

Grep -i and UTF-8 (Was: Re: Issue with grep -i (on i386 only?))

2009-11-04 Thread Mel Flynn

On Wednesday 04 November 2009 13:49:54 Mel Flynn wrote:

> Using env MALLOC_OPTIONS= also has no impact at all (just in case defaults
> aren't that). Since fgrep is fast and basically seeds the cache for grep,
> I'm ruling out disks/io reads. In fact, /tmp on this laptop is memory disk
> (one reason I couldn't go up to 64M :)). I honestly can't figure out what
> my 'local problem' could be or your optimization.

It hit me. Rather than a local problem, it's a locale problem:
= 16777216
= en_US.UTF-8
= fgrep
0.04 real 0.04 user 0.00 sys
0.04 real 0.02 user 0.02 sys
0.02 real 0.01 user 0.00 sys
= grep
0.04 real 0.04 user 0.00 sys
3.74 real 3.55 user 0.02 sys
1.95 real 1.83 user 0.03 sys
= en_US.ISO8859-1
= fgrep
0.04 real 0.04 user 0.00 sys
0.04 real 0.03 user 0.00 sys
0.02 real 0.01 user 0.01 sys
= grep
0.05 real 0.03 user 0.00 sys
0.05 real 0.04 user 0.00 sys
0.08 real 0.04 user 0.03 sys
= en_US.US-ASCII
= fgrep
0.04 real 0.01 user 0.02 sys
0.05 real 0.03 user 0.01 sys
0.02 real 0.00 user 0.02 sys
= grep
0.04 real 0.03 user 0.00 sys
0.05 real 0.03 user 0.00 sys
0.08 real 0.06 user 0.01 sys
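
The timings only show that the UTF-8 locale is the slow one; the thread does
not say why. One plausible reading, stated here as an assumption, is that
UTF-8 is a multibyte encoding while ISO8859-1 and US-ASCII are single-byte.
The sketch below shows how a program can tell the two kinds of locale apart;
the function name locale_is_multibyte is made up for this example.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/*
 * Returns 1 if the named locale uses a multibyte encoding, 0 if it is a
 * single-byte locale, and -1 if the locale is not installed.
 * Side effect: changes the process-wide LC_CTYPE setting.
 */
int
locale_is_multibyte(const char *name)
{
    assert(name != NULL);
    if (setlocale(LC_CTYPE, name) == NULL)
        return -1;
    /* MB_CUR_MAX is the longest character, in bytes, in this locale. */
    return MB_CUR_MAX > 1 ? 1 : 0;
}
```

Locale names such as en_US.UTF-8 are system dependent, which is why the
function reports a missing locale separately instead of guessing.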

-- 
Mel

Issue with grep -i (on i386 only?)

2009-11-03 Thread Mel Flynn

Hi,

attached is a little test script for grep's -i performance. I tried a few
different machines, and the 64-bit 7.2 machine I could steal doesn't seem to be
affected and outperforms pcregrep.
On i386 machines, grep -i is significantly slower:
i386, 7.2-STABLE of Sep 8, load averages: 0.00, 0.02, 0.00,
Mem: 336M Active, 442M Inact, 217M Wired, 38M Cache, 112M Buf, 198M Free
dev.cpu.0.freq: 2992 (Intel P-IV HTT enabled)
16Meg file result:
= 16777216
= fgrep
0.04 real 0.02 user 0.01 sys
0.04 real 0.03 user 0.01 sys
= pcregrep
0.21 real 0.19 user 0.02 sys
0.21 real 0.20 user 0.00 sys
= grep
0.04 real 0.02 user 0.01 sys  <-- not -i
3.64 real 3.61 user 0.01 sys  <-- -i

i386, 8.0-RC1 FreeBSD 8.0-RC1 #15 r197337M, load averages: 1.61, 1.35, 1.12
Mem: 920M Active, 87M Inact, 215M Wired, 69M Cache, 112M Buf, 195M Free
dev.cpu.0.freq: 1733 (Intel dual core laptop)
16Meg file result:
= 16777216
= fgrep
0.04 real 0.02 user 0.01 sys
0.05 real 0.04 user 0.00 sys
= pcregrep
0.26 real 0.23 user 0.01 sys
0.29 real 0.24 user 0.00 sys
= grep
0.04 real 0.04 user 0.00 sys
4.73 real 4.15 user 0.01 sys

amd64, 7.2-RELEASE-p4 #1 r198384M, load averages: 0.00, 0.00, 0.00
Mem: 115M Active, 182M Inact, 264M Wired, 101M Cache, 213M Buf, 1311M Free
CPU: Dual-Core AMD Opteron(tm) Processor 2210 (1800.08-MHz K8-class CPU)
64Meg file result:
= 67108864
= fgrep
0.18 real 0.13 user 0.04 sys
0.19 real 0.17 user 0.02 sys
= pcregrep
0.89 real 0.85 user 0.03 sys
0.98 real 0.92 user 0.06 sys
= grep
0.18 real 0.16 user 0.01 sys
0.19 real 0.16 user 0.03 sys


So on the laptop I modified the test script as it is attached now, and while
there is still a significant delay, the wall-clock time is less than half when
the expression is rewritten with the same meaning:
= 16777216
= fgrep
0.04 real 0.03 user 0.00 sys
0.05 real 0.03 user 0.01 sys
0.02 real 0.00 user 0.00 sys
= pcregrep
0.26 real 0.21 user 0.02 sys
0.26 real 0.22 user 0.02 sys
0.44 real 0.35 user 0.01 sys
= grep
0.04 real 0.04 user 0.00 sys
4.45 real 4.15 user 0.01 sys
2.00 real 1.81 user 0.00 sys <-- [fF][Oo][Oo]

So it looks to me that, while there is a problem with case-insensitive
comparison, just rewriting the expression is an optimization grep could
perform.
Either way, with the new text tools being written (done?), is this problem
being attacked, not fixable due to specifications, or not considered an issue?
Any PRs needed, or that I missed? Patches to try?
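
The rewrite measured above (foo becoming a bracket expression per letter) is
mechanical, which a short sketch can show. This is an illustration only, not
code from grep: the function name expand_icase is made up, only ASCII letters
are handled, and regex metacharacters in the input are copied through
unescaped.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Rewrite a literal such as "foo" into "[fF][oO][oO]".  Only ASCII
 * letters are expanded; every other byte is copied unchanged.
 * Returns a malloc()ed string the caller must free, or NULL.
 */
char *
expand_icase(const char *lit)
{
    size_t len;
    char *out, *p;

    assert(lit != NULL);
    len = strlen(lit);
    out = malloc(len * 4 + 1);  /* worst case: "[xX]" per input byte */
    if (out == NULL)
        return NULL;

    p = out;
    for (; *lit != '\0'; lit++) {
        unsigned char c = (unsigned char)*lit;

        if (c < 128 && isalpha(c)) {
            *p++ = '[';
            *p++ = (char)tolower(c);
            *p++ = (char)toupper(c);
            *p++ = ']';
        } else {
            *p++ = (char)c;
        }
    }
    *p = '\0';
    return out;
}
```

The result can be handed to grep as an ordinary pattern without -i, which is
the third command timed for each program in the output above.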

[And it just occurred to me bsdgrep is in ports]:
= bsdgrep
0.93 real 0.74 user 0.00 sys
4.80 real 4.33 user 0.02 sys
4.97 real 4.34 user 0.01 sys

So here the optimization does not fly.
-- 
Mel
#!/bin/sh
# vim: ts=4 sw=4 noet tw=78 ai

PCREGREP=`which pcregrep`
BSDGREP=`which bsdgrep`
[ -n "${PCREGREP}" ] && PCREGREP=`basename ${PCREGREP}`
[ -n "${BSDGREP}" ] && BSDGREP=`basename ${BSDGREP}`

me=`basename $0`
BYTES="1048576 2097152 4194304 8388608 16777216"
if [ ! -x /usr/bin/jot ]; then
    echo Need jot
    exit 1
fi
if [ ! -x /usr/bin/rs ]; then
    echo Need rs
    exit 1
fi

for b in ${BYTES}; do
    TMPFILE=`mktemp -t ${me}`
    if [ ! -f ${TMPFILE} ]; then
        echo Can\'t create tmp files in ${TMPDIR:=/tmp}
        exit 2
    fi
    # Fill the file with random lowercase letters, 20 columns per line.
    jot -r -c ${b} a z |rs -g 0 20 > ${TMPFILE}
    echo = ${b}
    for prog in fgrep ${PCREGREP} ${BSDGREP} grep ; do
        echo = ${prog}
        # Plain match, -i match, and the hand-expanded equivalent of -i.
        /usr/bin/time ${prog} foo ${TMPFILE} >/dev/null
        /usr/bin/time ${prog} -i foo ${TMPFILE} >/dev/null
        /usr/bin/time ${prog} '[fF][Oo][Oo]' ${TMPFILE} >/dev/null
    done
    rm ${TMPFILE}
done


Re: Issue with grep -i (on i386 only?)

2009-11-03 Thread Mel Flynn

On Tuesday 03 November 2009 22:19:05 Gabor Kovesdan wrote:
 Mel Flynn wrote:
  Hi,
 
  attached a little test script for grep's -i performance. I tried a few
  different machines and the 64-bit 7.2 machine I could steal doesn't seem
  to be affected and outperforms pcregrep.
 
 Note, that pcregrep isn't POSIX regex so it's not a good base of
 comparison. PCRE provides a POSIX-compliant interface to deal with
 Perl-compatible regex for those, who are already familiar with the
 former but it's still Perl regex and not POSIX! That's why some people
 get confused when PCRE comes to the topic.

I realize this, but for the case in question it does not matter. Both 
'regexes' should do the same in PCRE and POSIX. I provided the comparison to 
show that the 'problem of case insensitive comparison' is solvable, at the 
very least for the simple case.

  On i386 machines, grep -i is significantly slower:
  i386, 7.2-STABLE of Sep 8, load averages: 0.00, 0.02, 0.00,
  Mem: 336M Active, 442M Inact, 217M Wired, 38M Cache, 112M Buf, 198M Free
  dev.cpu.0.freq: 2992 (Intel P-IV HTT enabled)
  16Meg file result:
  = 16777216
  = fgrep
  0.04 real 0.02 user 0.01 sys
  0.04 real 0.03 user 0.01 sys
  = pcregrep
  0.21 real 0.19 user 0.02 sys
  0.21 real 0.20 user 0.00 sys
  = grep
  0.04 real 0.02 user 0.01 sys  not -i
  3.64 real 3.61 user 0.01 sys  -i
 
 It's an interesting observation, I have never heard of this.
 
  So it looks to me that, while there is a problem with case insensitive
  comparison, just rewriting the expression is an optimization grep could
  perform.
  Either way, with the new text tools being written (done?) is this problem
  being attacked, not fixable due to specifications or not considered an
  issue? Any PR's needed / I missed? Patches to try?
 
  [And it just occurred to me bsdgrep is in ports]:
  = bsdgrep
  0.93 real 0.74 user 0.00 sys
  4.80 real 4.33 user 0.02 sys
  4.97 real 4.34 user 0.01 sys
 
  So here the optimization does not fly.
 
 Unfortunately, this is the most important issue with BSDL texttools. In
 the grep case, the BSDL version is ready and feature-complete but the
 performance isn't quite satisfying. The main reason of this is GNU grep
 uses a lot of shortcuts, which results in a bloated code (8000 LOC),
 while BSDL grep keeps everything simple and straightforward (1500 LOC).
 IMO, the desired solution would be to keep grep small and get a modern
 regex library for FreeBSD, which performs well. Pushing regex
 optimizations into grep is a bad idea because it not just makes the code
 bloated but other regex users won't benefit from the optimization so the
 problem should be fixed at its roots. And the current regex library we
 have is old, slow and doesn't support wchar, at all.

With this kind of difference, I don't really care who performs the 
optimization, but it seems that multiple options at the same character 
position are not handled very well, with an extra penalty for case-insensitive 
matching.
Why this isn't present on my 64-bit machine is a bit of a mystery to me, but 
since almost no time is spent in sys, I can't blame it on the kernel.

 Btw, do you mind if I include your script into the BSD grep
 distribution? I already planned to write something like this for future
 testing.

Consider it public domain.
-- 
Mel

Re: Running a program through gdb without interfering

2009-10-09 Thread Mel Flynn

On Friday 09 October 2009 11:38:29 Dag-Erling Smørgrav wrote:
 Mel Flynn mel.flynn+fbsd.hack...@mailing.thruhere.net writes:
  is there a way to have a program run through gdb and gdb only record a
  segfault, but otherwise let the program run?
 
 Yes, just run gdb /path/to/program and type run.

Not what I was looking for. The segfaults are random and the only way to 
somewhat reliably reproduce it is to have portmaster invoke it as its 
PM_SU_CMD. And no, running that same command again doesn't trigger the 
segfault, so it's something environmental. Hence I'm looking for something 
like:
gdb -batch -x script_with_run_cmd.gdb -exec /usr/local/bin/sudo $argv

where somehow I need $argv to be passed as arguments to sudo. I'm thinking I 
should just wrap it and mktemp(1) a new command script for gdb to use with 
"set args $*", but if anyone has a more clever idea, I'd love to hear it.
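
A rough sketch of that wrapper idea, for illustration. The names quote_arg and make_gdb_cmdfile are made up, and the sketch assumes arguments contain no newlines. It only generates the throwaway gdb command file; whether gdb can usefully run a setuid binary this way is a separate question.

```shell
#!/bin/sh
# quote_arg (hypothetical helper): single-quote one argument for gdb's
# "set args" line, which gdb passes through a shell when it starts the
# program. Embedded single quotes become '\''.
quote_arg() {
	printf " '%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# make_gdb_cmdfile (hypothetical helper): write a command file that sets
# the arguments, runs the program, and prints a backtrace once the
# program has stopped or exited. Prints the path of the file.
make_gdb_cmdfile() {
	cmdfile=$(mktemp "${TMPDIR:-/tmp}/gdbwrap.XXXXXX") || return 1
	{
		printf 'set args'
		for arg in "$@"; do
			quote_arg "$arg"
		done
		printf '\nrun\nbt\n'
	} > "$cmdfile"
	printf '%s\n' "$cmdfile"
}

# Demo: show the file generated for one of the failing commands.
cmds=$(make_gdb_cmdfile pkg_delete glproto2)
cat "$cmds"
rm -f "$cmds"

# Intended use inside the wrapper (commented out; needs gdb and sudo):
#   cmds=$(make_gdb_cmdfile "$@")
#   gdb -batch -x "$cmds" /usr/local/bin/sudo
#   rm -f "$cmds"
```

Generating the file per invocation costs one extra temp file and a little IO, which is the trade-off against a fixed command script that cannot carry the arguments.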

  [...] sudo *sometimes* segfaults [...] However, it doesn't dump core
 
 sudo(1) is setuid root.  You need to set kern.sugid_coredump to get it
 to dump core.

It still segfaults and doesn't dump:
Oct  9 04:34:18 smell kernel: pid 39476 (sudo), uid 0: exited on signal 11
Oct  9 04:36:32 smell kernel: pid 79657 (sudo), uid 0: exited on signal 11
Oct  9 04:36:43 smell kernel: pid 82390 (sudo), uid 0: exited on signal 11
Oct  9 04:51:46 smell kernel: pid 3601 (sudo), uid 0: exited on signal 11

find / -name '*.core' in the jail does not yield anything. 

  [1] In order to get this working I had to put a statically compiled ps in
  the jail, or the uid test would fail. It has the downside that it lists
  both jail and host processes, [...]
 
 Uh, no.  Processes outside the jail are not visible inside it, no matter
 what version of ps(1) or top(1) or any other such program you use.

I'll write this off as pilot error, cause I cannot reproduce it. I saw bash as 
one of the processes listed in a blank ps run, which isn't installed in the 
jail, but since I don't have the terminal history anymore, it's entirely 
possible I ran ps on the host.
-- 
Mel

Re: Running a program through gdb without interfering

2009-10-09 Thread Mel Flynn

On Friday 09 October 2009 16:50:04 Mel Flynn wrote:
 On Friday 09 October 2009 11:38:29 Dag-Erling Smørgrav wrote:
  Mel Flynn mel.flynn+fbsd.hack...@mailing.thruhere.net writes:

   [...] sudo *sometimes* segfaults [...] However, it doesn't dump core
 
  sudo(1) is setuid root.  You need to set kern.sugid_coredump to get it
  to dump core.
 
 It still segfaults and doesn't dump:
 Oct  9 04:34:18 smell kernel: pid 39476 (sudo), uid 0: exited on signal 11
 Oct  9 04:36:32 smell kernel: pid 79657 (sudo), uid 0: exited on signal 11
 Oct  9 04:36:43 smell kernel: pid 82390 (sudo), uid 0: exited on signal 11
 Oct  9 04:51:46 smell kernel: pid 3601 (sudo), uid 0: exited on signal 11
 
 find / -name '*.core' in the jail does not yield anything.

FYI, there's one read-only mount into the jail, being /usr/src. I don't see a 
reason, given the commands it segfaults on, for $cwd to be below that. For 
example, it segfaulted on sudo pkg_delete glproto2.
Thought I'd mention it to rule it out.
-- 
Mel

Re: Running a program through gdb without interfering

2009-10-09 Thread Mel Flynn

On Friday 09 October 2009 16:50:04 Mel Flynn wrote:
 On Friday 09 October 2009 11:38:29 Dag-Erling Smørgrav wrote:
  Mel Flynn mel.flynn+fbsd.hack...@mailing.thruhere.net writes:
   is there a way to have a program run through gdb and gdb only record a
   segfault, but otherwise let the program run?
 
  Yes, just run gdb /path/to/program and type run.
 
 Not what I was looking for. The segfaults are random and the only way to
 somewhat reliably reproduce it is to have portmaster invoke it as its
 PM_SU_CMD. And no, running that same command again doesn't trigger the
 segfault, so it's something environmental. Hence I'm looking for
  something like:
 gdb -batch -x script_with_run_cmd.gdb -exec /usr/local/bin/sudo $argv
 
 where somehow I need $argv to be passed as arguments to sudo. I'm thinking
  I should just wrap it and mktemp(1) a new command script for gdb to use
  with set args $*, but if anyone has a more clever idea, I'd love to hear
  it.

Dead end path :/
% bin/gdbsudo echo hi
/tmp/gdbsudo.F3kdwJ:1: Error in sourced command file:
/usr/local/bin/sudo: Permission denied.

% ls -l /usr/local/bin/sudo
---s--x--x  2 root  wheel  116380 Oct  8 18:31 /usr/local/bin/sudo

% sudo chmod g+r /usr/local/bin/sudo

% bin/gdbsudo echo hi

(no debugging symbols found)...(no debugging symbols found)...(no debugging 
symbols found)...(no debugging symbols found)...sudo: must be setuid root

Program exited with code 01.

Perhaps the cause of it not dumping core either. Would've been nice to know 
why it segfaults, but not nice enough to keep digging.
-- 
Mel

Re: Running a program through gdb without interfering

2009-10-09 Thread Mel Flynn

On Friday 09 October 2009 21:27:21 Dag-Erling Smørgrav wrote:
 Mel Flynn mel.flynn+fbsd.hack...@mailing.thruhere.net writes:
  Dag-Erling Smørgrav d...@des.no writes:
   Yes, just run gdb /path/to/program and type run.
 
  Not what I was looking for. The segfaults are random and the only way to
  somewhat reliably reproduce it is to have portmaster invoke it as its
  PM_SU_CMD. And no, running that same command again doesn't trigger the
  segfault, so it's something environmental. Hence I'm looking for
  something like:
  gdb -batch -x script_with_run_cmd.gdb -exec /usr/local/bin/sudo $argv
 
  where somehow I need $argv to be passed as arguments to sudo. I'm
  thinking I should just wrap it and mktemp(1) a new command script for gdb
  to use with set args $*, but if anyone has a more clever idea, I'd love
  to hear it.
 
 Why look for a clever option, when the simple one will do just fine?

Cause I don't know how much of the cause of this bug I'm influencing. Even 
though this is now the simple solution, it would be simpler if gdb (or another 
debugger) could work similarly to sudo, where it would take the first argument 
as the binary and the rest as arguments to the binary. This would do away with 
some extra IO I'm now creating. Though it's unlikely to be related to IO, 
there is no pattern that I've found yet for the segfault, so I'm trying to 
limit any extra stuff.

I'll patch the kernel tomorrow with the new sysctl and see how far that gets 
me.


 Add 'ulimit -c unlimited' somewhere in the script before it invokes sudo.

I'll add it.
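
Roughly what that could look like, as a sketch only. The function name run_with_cores and the SUDO_BIN variable are made up for illustration, and kern.sugid_coredump still has to be enabled separately for a setuid binary to dump core.

```shell
#!/bin/sh
# Assumed location of the real sudo; override via the environment.
SUDO_BIN=${SUDO_BIN:-/usr/local/bin/sudo}

# run_with_cores (hypothetical helper): raise the core size limit in a
# subshell, then replace that subshell with the given command.
run_with_cores() {
	(
		# Raising the limit fails if the hard limit is lower; warn and go on.
		ulimit -c unlimited 2>/dev/null ||
			echo "warning: could not raise core size limit" >&2
		exec "$@"
	)
}

# Demo: print the core size limit the child actually sees.
run_with_cores sh -c 'ulimit -c'

# As a PM_SU_CMD wrapper the last line would instead be (commented out):
#   run_with_cores "$SUDO_BIN" "$@"
```

Doing the ulimit in a subshell keeps the change scoped to the one command instead of the whole calling script.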
-- 
Mel

Running a program through gdb without interfering

2009-10-08 Thread Mel Flynn

Hi,

is there a way to have a program run through gdb and gdb only record a 
segfault, but otherwise let the program run?

Why I'd like this is the following:
I've got an i386 jail on an amd64 box, running 7.2-p4. UNAME_p and UNAME_m have 
been set to i386 as well as ARCH in /etc/make.conf. Running portmaster[1] to 
build ports under my uid and PM_SU_CMD, sudo *sometimes* segfaults. It's only 
sudo, so at present I don't have a reason to doubt memory. However, it doesn't 
dump core, so I'm at a loss as to what the culprit could be.

[1] In order to get this working I had to put a statically compiled ps in the 
jail, or the uid test would fail. It has the downside that it lists both jail 
and host processes, but it is acceptable to me as the jail is only accessible 
from the host (pf enforced). I suspect sudo has a similar problem, or one 
related to ps returning processes from a uid that doesn't exist in the jail, 
but without a backtrace I don't know what to fix.
-- 
Mel

Re: Running a program through gdb without interfering

2009-10-08 Thread Mel Flynn

On Friday 09 October 2009 00:38:32 Paul B Mahol wrote:
 On 10/9/09, Mel Flynn mel.flynn+fbsd.hack...@mailing.thruhere.net wrote:
  Hi,
 
  is there a way to have a program run through gdb and gdb only record a
  segfault, but otherwise let the program run?
 
  Why I'd like this is the following:
  I've got an i386 jail on an amd64 box, running 7.2-p4. UNAME_p and UNAME_m
  have
  been set to i386 as well as ARCH in /etc/make.conf. Running portmaster[1]
  to build ports under my uid and PM_SU_CMD, sudo *sometimes* segfaults.
  It's only
  sudo, so at present I don't have a reason to doubt memory. However, it
  doesn't
  dump core, so I'm at a loss what the culprit could be.
 
 Tried 'sysctl kern.sugid_coredump=1' ?

Hmm, no. Enabled now and waiting for the next segfault.
I actually looked at the sysctl -d, but it didn't register that this could be 
the main cause.
Perhaps that sentence could be clearer:
-kern.sugid_coredump: Enable coredumping set user/group ID processes
+kern.sugid_coredump: Allow setuid/setgid processes to dump core

-- 
Mel

Spot the error

2009-08-04 Thread Mel Flynn

Hi,

granted it took me less than a minute to figure it out, but the error is kind 
of not helping:
% mount -t msdofs /dev/label/camera ~/camera
mount: /dev/label/camera : Operation not supported by device

I would expect something along the lines of "unknown file system type". Is 
this fixable?
-- 
Mel

Re: Spot the error

2009-08-04 Thread Mel Flynn

On Tuesday 04 August 2009 13:43:24 Dimitry Andric wrote:
 On 2009-08-04 22:45, Mel Flynn wrote:
  % mount -t msdofs /dev/label/camera ~/camera
  mount: /dev/label/camera : Operation not supported by device
 
  I would expect something along the lines of "unknown file system type".
  Is this fixable?

 Yes, just use msdosfs instead. ;)  That said, it looks like ENODEV is
 returned by vfs_domount(), whenever the fs type is not found in the
 list of supported filesystems:

   [...]
   if (fsflags & MNT_ROOTFS)
           vfsp = vfs_byname(fstype);
   else
           vfsp = vfs_byname_kld(fstype, td, &error);
   if (vfsp == NULL)
           return (ENODEV);
   [...]

 Note that in the case when vfs_byname_kld() gets called, the error it
 returns is silently thrown away. What a pity. :)

 In any case, you could paint a lot of bikesheds about which error code
 from errno.h would be most suited for this situation, unfortunately.

I would expect "Unable to load fs: " + ENOENT. I was asking if this was 
fixable, cause it looked like the code has been abstracted to the point that 
specific errors were hard, but maybe I missed something.
-- 
Mel

Re: mmap/munmap with zero length

2009-07-20 Thread Mel Flynn

On Mon, 13 Jul 2009 16:39:09 -0400, John Baldwin j...@freebsd.org wrote:
 On Monday 13 July 2009 3:33:51 pm Tijl Coosemans wrote:
 On Monday 13 July 2009 20:28:08 John Baldwin wrote:
  On Sunday 05 July 2009 3:32:25 am Alexander Best wrote:
  so mmap differs from the POSIX recommendation right. the malloc.conf
  option seems more like a workaround/hack. imo it's confusing to have
  mmap and munmap deal differently with len=0. being able to
  successfully allocate memory which cannot be removed doesn't seem
  logical to me.
  
  This should fix it:
  
  --- //depot/user/jhb/acpipci/vm/vm_mmap.c
  +++ /home/jhb/work/p4/acpipci/vm/vm_mmap.c
  @@ -229,7 +229,7 @@
  
  fp = NULL;
  /* make sure mapping fits into numeric range etc */
  -   if ((ssize_t) uap->len < 0 ||
  +   if ((ssize_t) uap->len <= 0 ||
  ((flags & MAP_ANON) && uap->fd != -1))
  return (EINVAL);
 
 Why not uap->len == 0? Sizes of 2GiB and more (32bit) shouldn't cause
 an error.
 
 I don't actually disagree and know of locally modified versions of FreeBSD
 that remove this check for precisely that reason.

If this has hit the tree recently, I think it broke ccache.

Since I've also done make delete-old-libs and was about to rebuild all my
ports on my laptop, I'll investigate, as I'm not looking forward to doing
this twice for all dependants of libtool :(.

Failed to mmap
/var/db/ccache/mel/tmp.cpp_stderr.smoochies.rachie.is-a-geek.net.27934

kdump:
 27934 ccache   CALL  open(0x28201280,O_RDONLY,<unused>0x1)
 27934 ccache   NAMI  "/var/db/ccache/mel/tmp.cpp_stderr.smoochies.rachie.is-a-geek.net.27934"
 27934 ccache   RET   open 4
 27934 ccache   CALL  fstat(0x4,0xbfbfe7fc)
 27934 ccache   STRU  struct stat {dev=105, ino=895320, mode=-rw-r--r-- , nlink=1, uid=1003, gid=0, rdev=0, atime=1248069251, stime=1248069251, ctime=1248069251, birthtime=1248069251, size=0, blksize=4096, blocks=0, flags=0x0 }
 27934 ccache   RET   fstat 0
 27934 ccache   CALL  mmap(0,0,PROT_READ,MAP_PRIVATE,0x4,0,0)
 27934 ccache   RET   mmap -1 errno 22 Invalid argument

Sent from webmail, so excuse any formatting issues.
-- 
Mel


Re: mmap/munmap with zero length

2009-07-20 Thread Mel Flynn

On Sun, 19 Jul 2009 22:13:48 -0800, Mel Flynn
mel.flynn+fbsd.hack...@mailing.thruhere.net wrote:
 On Mon, 13 Jul 2009 16:39:09 -0400, John Baldwin j...@freebsd.org wrote:
 On Monday 13 July 2009 3:33:51 pm Tijl Coosemans wrote:
 On Monday 13 July 2009 20:28:08 John Baldwin wrote:
  On Sunday 05 July 2009 3:32:25 am Alexander Best wrote:
  so mmap differs from the POSIX recommendation right. the malloc.conf
  option seems more like a workaround/hack. imo it's confusing to have
  mmap and munmap deal differently with len=0. being able to
  successfully allocate memory which cannot be removed doesn't seem
  logical to me.
  
  This should fix it:
  
  --- //depot/user/jhb/acpipci/vm/vm_mmap.c
  +++ /home/jhb/work/p4/acpipci/vm/vm_mmap.c
  @@ -229,7 +229,7 @@
  
  fp = NULL;
  /* make sure mapping fits into numeric range etc */
  -   if ((ssize_t) uap->len < 0 ||
  +   if ((ssize_t) uap->len <= 0 ||
  ((flags & MAP_ANON) && uap->fd != -1))
  return (EINVAL);
 
 Why not uap->len == 0? Sizes of 2GiB and more (32bit) shouldn't cause
 an error.
 
 I don't actually disagree and know of locally modified versions of FreeBSD
 that remove this check for precisely that reason.
 
 If this has hit the tree recently, I think it broke ccache.
 
 Since I've also done make delete-old-libs and was about to rebuild all my
 ports on my laptop, I'll investigate, as I'm not looking forward to doing
 this twice for all dependants of libtool :(.
 
 Failed to mmap
 /var/db/ccache/mel/tmp.cpp_stderr.smoochies.rachie.is-a-geek.net.27934
 
 kdump:
  27934 ccache   CALL  open(0x28201280,O_RDONLY,<unused>0x1)
  27934 ccache   NAMI  "/var/db/ccache/mel/tmp.cpp_stderr.smoochies.rachie.is-a-geek.net.27934"
  27934 ccache   RET   open 4
  27934 ccache   CALL  fstat(0x4,0xbfbfe7fc)
  27934 ccache   STRU  struct stat {dev=105, ino=895320, mode=-rw-r--r-- , nlink=1, uid=1003, gid=0, rdev=0, atime=1248069251, stime=1248069251, ctime=1248069251, birthtime=1248069251, size=0, blksize=4096, blocks=0, flags=0x0 }
  27934 ccache   RET   fstat 0
  27934 ccache   CALL  mmap(0,0,PROT_READ,MAP_PRIVATE,0x4,0,0)
  27934 ccache   RET   mmap -1 errno 22 Invalid argument

Confirmed, attached patch fixes ccache. Probably should be patched in
patch-md4.

-- 
Mel

patch-mmap
Description: Binary data

Re: large pages (amd64)

2009-06-30 Thread Mel Flynn

On Tuesday 30 June 2009 00:24:00 Alan Cox wrote:
 Mel Flynn wrote:
  On Sunday 28 June 2009 15:41:49 Alan Cox wrote:
  Wojciech Puchar wrote:
  how can i check how much (or maybe - what processes) 2MB pages are
  actually allocated?
 
  I'm afraid that you can't with great precision.  For a given program
  execution, on an otherwise idle machine, you can only estimate the
  number by looking at the change in the quantity promotions + mappings -
  demotions before, during, and after the program execution.
 
  A program can call mincore(2) in order to determine if a virtual address
  is part of a 2 or 4MB virtual page.
 
  Would it be possible to expose the super page count as kve_super in the
  kinfo_vmentry struct so that procstat can show this information? If only
  to determine if one is using the feature and possibly benefiting from it.

 Yes, I think so.

  It looks like sys/kern/kern_proc.c could call mincore around the loop at
  line 1601 (rev 194498), but I know nothing about the vm subsystem to know
  the implications or locking involved. There's still 16 bytes of spare to
  consume, in the kve_vminfo struct though ;)

 Yes, to start with, you could replace the call to pmap_extract() with a
 call to pmap_mincore() and export a Boolean to user space that says,
 "This region of the address space contains one or more superpage mappings."

How about the attached?

% sudo procstat -av|grep 'S '
  PID  START END PRT  RES PRES REF SHD  FL TP PATH
 1754 0x2890 0x2ae0 rw- 93850   3   0 --S df
 2141 0x2f90 0x3080 rw- 37190   1   0 --S df
 2146 0x3eec 0x4fac rwx 17450   1   0 --S df

-- 
Mel

Index: sys/sys/user.h
===
--- sys/sys/user.h	(revision 195188)
+++ sys/sys/user.h	(working copy)
@@ -348,6 +348,7 @@
 
 #define	KVME_FLAG_COW		0x0001
 #define	KVME_FLAG_NEEDS_COPY	0x0002
+#define	KVME_FLAG_SUPER		0x0004
 
 #if defined(__amd64__)
 #define	KINFO_OVMENTRY_SIZE	1168
Index: sys/kern/kern_proc.c
===
--- sys/kern/kern_proc.c	(revision 195188)
+++ sys/kern/kern_proc.c	(working copy)
@@ -59,6 +59,7 @@
 #include <sys/signalvar.h>
 #include <sys/sdt.h>
 #include <sys/sx.h>
+#include <sys/mman.h>
 #include <sys/user.h>
 #include <sys/jail.h>
 #include <sys/vnode.h>
@@ -1599,8 +1600,13 @@
 		kve->kve_resident = 0;
 		addr = entry->start;
 		while (addr < entry->end) {
-			if (pmap_extract(map->pmap, addr))
+			int flags;
+
+			flags = pmap_mincore(map->pmap, addr);
+			if ( flags & MINCORE_INCORE )
 				kve->kve_resident++;
+			if( flags & MINCORE_SUPER )
+				kve->kve_flags |= KVME_FLAG_SUPER;
 			addr += PAGE_SIZE;
 		}
 
Index: usr.bin/procstat/procstat_vm.c
===
--- usr.bin/procstat/procstat_vm.c	(revision 195188)
+++ usr.bin/procstat/procstat_vm.c	(working copy)
@@ -49,7 +49,7 @@
 
 	ptrwidth = 2*sizeof(void *) + 2;
 	if (!hflag)
-		printf("%5s %*s %*s %3s %4s %4s %3s %3s %2s %-2s %-s\n",
+		printf("%5s %*s %*s %3s %4s %4s %3s %3s %3s %-2s %-s\n",
 		"PID", ptrwidth, "START", ptrwidth, "END", "PRT", "RES",
 		"PRES", "REF", "SHD", "FL", "TP", "PATH");
 
@@ -69,8 +69,9 @@
 		printf("%3d ", kve->kve_ref_count);
 		printf("%3d ", kve->kve_shadow_count);
 		printf("%-1s", kve->kve_flags & KVME_FLAG_COW ? "C" : "-");
-		printf("%-1s ", kve->kve_flags & KVME_FLAG_NEEDS_COPY ? "N" :
+		printf("%-1s", kve->kve_flags & KVME_FLAG_NEEDS_COPY ? "N" :
 		"-");
+		printf("%-1s ", kve->kve_flags & KVME_FLAG_SUPER ? "S" : "-");
 		switch (kve->kve_type) {
 		case KVME_TYPE_NONE:
 			str = "--";
Index: usr.bin/procstat/procstat.1
===
--- usr.bin/procstat/procstat.1	(revision 195188)
+++ usr.bin/procstat/procstat.1	(working copy)
@@ -332,6 +332,8 @@
 copy-on-write
 .It N
 needs copy
+.It S
+One or more superpage mappings are used
 .El
 .Sh EXIT STATUS
 .Ex -std

Re: large pages (amd64)

2009-06-29 Thread Mel Flynn

On Sunday 28 June 2009 15:41:49 Alan Cox wrote:
 Wojciech Puchar wrote:

  how can i check how much (or maybe - what processes) 2MB pages are
  actually allocated?

 I'm afraid that you can't with great precision.  For a given program
 execution, on an otherwise idle machine, you can only estimate the
 number by looking at the change in the quantity promotions + mappings -
 demotions before, during, and after the program execution.

 A program can call mincore(2) in order to determine if a virtual address
 is part of a 2 or 4MB virtual page.

Would it be possible to expose the super page count as kve_super in the 
kinfo_vmentry struct so that procstat can show this information? If only to 
determine if one is using the feature and possibly benefiting from it.

It looks like sys/kern/kern_proc.c could call mincore around the loop at line 
1601 (rev 194498), but I know nothing about the vm subsystem to know the 
implications or locking involved. There's still 16 bytes of spare to consume, 
in the kve_vminfo struct though ;)
-- 
Mel

Re: How best to debug locking/scheduler problems

2009-06-17 Thread Mel Flynn

On Wednesday 17 June 2009 04:15:26 John Baldwin wrote:
 On Tuesday 16 June 2009 7:01:45 pm Mel Flynn wrote:
  On Tuesday 16 June 2009 11:02:42 John Baldwin wrote:
   On Tuesday 16 June 2009 1:52:23 pm Mel Flynn wrote:
Hi John,
   
On Tuesday 16 June 2009 04:19:57 John Baldwin wrote:
 On Monday 15 June 2009 5:53:05 pm Mel Flynn wrote:
    PID    TID COMM             TDNAME           KSTACK
   4283 100215 kdeinit4         -                mi_switch
  turnstile_wait _mtx_lock_sleep uipc_peeraddr kern_getpeername
  getpeername syscall Xint0x80_syscall
  % ps -ww 4283
    PID  TT  STAT      TIME COMMAND
   4283  ??  T      0:00.38 kdeinit4: kdeinit4: kio_http http
  local:/tmp/ksocket-mel/klauncherxJ1635.slave-socket
  local:/tmp/ksocket-mel/plasmayC1653.slave-socket (kdeinit4)
 
  %ls -l /tmp/ksocket-mel/
 
  total 2
  -rw-rw-r--  1 mel  wheel  62 Jun 14 22:55 KSMserver__0
  srw-------  1 mel  wheel   0 Jun 14 22:55 kdeinit4__0
  srwxrwxr-x  1 mel  wheel   0 Jun 14 22:55
  klauncherxJ1635.slave-socket

 You can use kgdb and the scripts at www.freebsd.org/~jhb/gdb. 
 Simply run 'kgdb' as root and do 'lcd /folder/with/scripts' and
 'source gdb6'. You can then do 'lockchain 4283' to find who holds
 the lock this thread is blocked on and what state they are in.
   
Looks like a deadlock:
   
(kgdb) lockchain 4283
thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
DEADLOCK
   
Looking through the scripts now to see how I can get more info on the
call chain and hoping I don't panic the machine ;). It is quite
random to reproduce.
  
   In kgdb you can simply do 'tid 100122' followed by 'bt' and 'tid
   100215' followed by 'bt'.
 
  Cool, thanks for helping John. Of course it pretty much shows me what
  procstat -k shows and can't get any info on the userland part, but I can
  fully inspect the locks and threads.
 
  Both threads are in TDS_INHIBITED state, and blocked on:
  (kgdb) frame 0
  #0  sched_switch (td=0xc5971240, newtd=0xc4d39900, flags=259)
      at /usr/src/sys/kern/sched_ule.c:1864
  1864            cpuid = PCPU_GET(cpuid);

 That doesn't really tell us anything except that it isn't running.  We know
 it is actually blocked on a lock, and we need the full stack trace to see
 where the two threads were trying to acquire the locks, hence 'bt'.
 'procstat -k' output would be fine, too.

They're blocking on each other:
(kgdb) tid 100122
(kgdb) print td->td_contested.lh_first
$16 = (struct turnstile *) 0xc4f8fb00
(kgdb) print td->td_lock
$17 = (struct mtx * volatile) 0xc538cb00
(kgdb) tid 100215
(kgdb) print td->td_contested.lh_first
$18 = (struct turnstile *) 0xc538cb00
(kgdb) print td->td_lock
$19 = (struct mtx * volatile) 0xc4f8fb00

the respective bt's:
(kgdb) tid 100122
#0  sched_switch (td=0xc56e8900, newtd=0xc4d39b40, flags=259)
    at /usr/src/sys/kern/sched_ule.c:1864
#1  0xc064bcfa in mi_switch (flags=259, newtd=0x0)
    at /usr/src/sys/kern/kern_synch.c:444
#2  0xc067d30b in turnstile_wait (ts=0xc538cb00, owner=0xc5971240,
    queue=Variable "queue" is not available.
)
    at /usr/src/sys/kern/subr_turnstile.c:745
#3  0xc06346ac in _mtx_lock_sleep (m=0xc6806348, tid=3312355584, opts=0,
    file=0x0, line=0)
    at /usr/src/sys/kern/kern_mutex.c:447
#4  0xc06a68a5 in uipc_peeraddr (so=0xc64309a8, nam=0xe79a2c70)
    at /usr/src/sys/kern/uipc_usrreq.c:682
#5  0xc06a1e71 in kern_getpeername (td=0xc56e8900, fd=12, sa=0xe79a2c70,
    alen=0xe79a2c6c)
    at /usr/src

Re: How best to debug locking/scheduler problems

2009-06-17 Thread Mel Flynn

On Wednesday 17 June 2009 13:17:37 John Baldwin wrote:
 On Wednesday 17 June 2009 3:52:54 pm Mel Flynn wrote:
  On Wednesday 17 June 2009 04:15:26 John Baldwin wrote:
   On Tuesday 16 June 2009 7:01:45 pm Mel Flynn wrote:
On Tuesday 16 June 2009 11:02:42 John Baldwin wrote:
 On Tuesday 16 June 2009 1:52:23 pm Mel Flynn wrote:
  Hi John,
 
  On Tuesday 16 June 2009 04:19:57 John Baldwin wrote:
   On Monday 15 June 2009 5:53:05 pm Mel Flynn wrote:
  PID    TID COMM             TDNAME           KSTACK
 4283 100215 kdeinit4         -                mi_switch
turnstile_wait _mtx_lock_sleep uipc_peeraddr kern_getpeername
getpeername syscall Xint0x80_syscall
% ps -ww 4283
  PID  TT  STAT      TIME COMMAND
 4283  ??  T      0:00.38 kdeinit4: kdeinit4: kio_http http
local:/tmp/ksocket-mel/klauncherxJ1635.slave-socket
local:/tmp/ksocket-mel/plasmayC1653.slave-socket (kdeinit4)
   
%ls -l /tmp/ksocket-mel/
   
total 2
-rw-rw-r--  1 mel  wheel  62 Jun 14 22:55 KSMserver__0
srw-------  1 mel  wheel   0 Jun 14 22:55 kdeinit4__0
srwxrwxr-x  1 mel  wheel   0 Jun 14 22:55
klauncherxJ1635.slave-socket
  
   You can use kgdb and the scripts at www.freebsd.org/~jhb/gdb.
   Simply run 'kgdb' as root and do 'lcd /folder/with/scripts' and
   'source gdb6'. You can then do 'lockchain 4283' to find who
   holds the lock this thread is blocked on and what state they
   are in.
 
  Looks like a deadlock:
 
  (kgdb) lockchain 4283
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   DEADLOCK
 
  Looking through the scripts now to see how I can get more info on the
  call chain and hoping I don't panic the machine ;). It is quite
  random to reproduce.

 In kgdb you can simply do 'tid 100122' followed by 'bt' and 'tid
 100215' followed by 'bt'.
   
Cool, thanks for helping John. Of course it pretty much shows me what
procstat -k shows and can't get any info on the userland part, but I
can fully inspect the locks and threads.
   
Both threads are in TDS_INHIBITED state, and blocked on:
(kgdb) frame 0
#0  sched_switch (td=0xc5971240, newtd=0xc4d39900, flags=259)
at /usr/src/sys/kern/sched_ule.c:1864
1864            cpuid = PCPU_GET(cpuid);
  
   That doesn't really tell us anything except that it isn't running.  We know
   it is actually blocked on a lock, and we need the full stack trace to
   see where the two threads were trying to acquire the locks, hence 'bt'.
   'procstat -k' output would be fine, too.
 
  the respective bt's:
  (kgdb) tid 100122
  at /usr/src/sys/kern/kern_mutex.c:447
  #4  0xc06a68a5 in uipc_peeraddr (so=0xc64309a8, nam=0xe79a2c70)
  at /usr/src/sys/kern/uipc_usrreq.c:682
  #5  0xc06a1e71 in kern_getpeername (td=0xc56e8900, fd=12, sa=0xe79a2c70,
  alen=0xe79a2c6c)
  at /usr/src/sys/kern/uipc_syscalls.c:1566
 
  (kgdb) tid 100215
  (kgdb) bt
  at /usr/src/sys/kern/kern_mutex.c:447
  #4  0xc06a68a5 in uipc_peeraddr (so=0xc6976338, nam=0xe9ae9c70)
  at /usr/src/sys/kern/uipc_usrreq.c:682
  #5  0xc06a1e71 in kern_getpeername (td=0xc5971240, fd=7, sa=0xe9ae9c70,
  alen=0xe9ae9c6c)
  at /usr/src/sys/kern/uipc_syscalls.c:1566

 These are the key frames.  It looks like uipc_peeraddr() tries to lock two
 unp locks w/o any protection from the global unp

Re: How best to debug locking/scheduler problems

2009-06-16 Thread Mel Flynn

Hi John,

On Tuesday 16 June 2009 04:19:57 John Baldwin wrote:
 On Monday 15 June 2009 5:53:05 pm Mel Flynn wrote:

      PID    TID COMM             TDNAME           KSTACK
     4283 100215 kdeinit4         -                mi_switch turnstile_wait
    _mtx_lock_sleep uipc_peeraddr kern_getpeername getpeername syscall
    Xint0x80_syscall
    % ps -ww 4283
      PID  TT  STAT      TIME COMMAND
     4283  ??  T      0:00.38 kdeinit4: kdeinit4: kio_http http
    local:/tmp/ksocket-mel/klauncherxJ1635.slave-socket
    local:/tmp/ksocket-mel/plasmayC1653.slave-socket (kdeinit4)

    % ls -l /tmp/ksocket-mel/
    total 2
    -rw-rw-r--  1 mel  wheel  62 Jun 14 22:55 KSMserver__0
    srw-------  1 mel  wheel   0 Jun 14 22:55 kdeinit4__0
    srwxrwxr-x  1 mel  wheel   0 Jun 14 22:55 klauncherxJ1635.slave-socket

 You can use kgdb and the scripts at www.freebsd.org/~jhb/gdb.  Simply
 run 'kgdb' as root and do 'lcd /folder/with/scripts' and 'source gdb6'. 
 You can then do 'lockchain 4283' to find who holds the lock this thread is
 blocked on and what state they are in.

Looks like a deadlock:

(kgdb) lockchain 4283
 thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
 thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
 thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
 thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
 thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
 thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
 thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
 thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
 thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
 thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
 thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
 thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
 thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
 thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
 thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
 thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
 thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
 thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
 thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
 thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
 DEADLOCK

Looking through the scripts now to see how I can get more info on the call
chain, and hoping I don't panic the machine ;). The problem is hard to
reproduce; it shows up quite randomly.
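
For readers unfamiliar with what 'lockchain' reports: the output above reads
as a walk along the wait-for chain (a thread, the lock it is blocked on, that
lock's owner, and so on) until a thread turns up a second time. What follows
is only a minimal Python sketch of that idea, not the code of the gdb6
scripts, whose implementation is not shown in this thread. The function name
and both dictionaries are hypothetical, and the ownership map is inferred
from the chain itself (each thread is assumed to hold the lock the other one
is waiting for).

```python
def lockchain(start_tid, blocked_on, owner_of):
    """Follow thread -> lock -> owner links starting at start_tid.

    blocked_on maps a thread id to the lock it is waiting for;
    owner_of maps a lock to the thread id currently holding it.
    Returns (chain, deadlocked), where chain is a list of (tid, lock) pairs.
    """
    chain = []
    seen = set()
    tid = start_tid
    while tid in blocked_on:
        if tid in seen:
            # We are back at a thread we already visited: the cycle is closed.
            return chain, True
        seen.add(tid)
        lock = blocked_on[tid]
        chain.append((tid, lock))
        if lock not in owner_of:
            break  # owner unknown, cannot follow the chain any further
        tid = owner_of[lock]
    return chain, False


# Thread ids and lock addresses taken from the lockchain output above.
blocked_on = {100215: 0xC64374A0, 100122: 0xC6806348}
# Assumption: ownership as implied by the alternating chain.
owner_of = {0xC64374A0: 100122, 0xC6806348: 100215}

chain, deadlocked = lockchain(100215, blocked_on, owner_of)
for tid, lock in chain:
    print("thread %d blocked on lock %#x unp_mtx" % (tid, lock))
if deadlocked:
    print("DEADLOCK")
```

Unlike the real output, which repeats the two threads many times before
printing DEADLOCK, this sketch stops at the first repeated thread.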
-- 
Mel

Re: How best to debug locking/scheduler problems

2009-06-16 Thread Mel Flynn

On Tuesday 16 June 2009 11:02:42 John Baldwin wrote:
 On Tuesday 16 June 2009 1:52:23 pm Mel Flynn wrote:
  Hi John,
 
  On Tuesday 16 June 2009 04:19:57 John Baldwin wrote:
   On Monday 15 June 2009 5:53:05 pm Mel Flynn wrote:
      PID    TID COMM             TDNAME           KSTACK
     4283 100215 kdeinit4         -                mi_switch turnstile_wait
    _mtx_lock_sleep uipc_peeraddr kern_getpeername getpeername syscall
    Xint0x80_syscall
    % ps -ww 4283
      PID  TT  STAT      TIME COMMAND
     4283  ??  T      0:00.38 kdeinit4: kdeinit4: kio_http http
    local:/tmp/ksocket-mel/klauncherxJ1635.slave-socket
    local:/tmp/ksocket-mel/plasmayC1653.slave-socket (kdeinit4)

    % ls -l /tmp/ksocket-mel/
    total 2
    -rw-rw-r--  1 mel  wheel  62 Jun 14 22:55 KSMserver__0
    srw-------  1 mel  wheel   0 Jun 14 22:55 kdeinit4__0
    srwxrwxr-x  1 mel  wheel   0 Jun 14 22:55 klauncherxJ1635.slave-socket
  
   You can use kgdb and the scripts at www.freebsd.org/~jhb/gdb.  Simply
   run 'kgdb' as root and do 'lcd /folder/with/scripts' and 'source gdb6'.
   You can then do 'lockchain 4283' to find who holds the lock this thread
   is blocked on and what state they are in.
 
  Looks like a deadlock:
 
  (kgdb) lockchain 4283
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   thread 100215 (pid 4283, kdeinit4) blocked on lock 0xc64374a0 unp_mtx
   thread 100122 (pid 1635, klauncher) blocked on lock 0xc6806348 unp_mtx
   DEADLOCK
 
  Looking through the scripts now to see how I can get more info on the
  call chain and hoping I don't panic the machine ;). It is quite random to
  reproduce.

 In kgdb you can simply do 'tid 100122' followed by 'bt' and 'tid 100215'
 followed by 'bt'.

Cool, thanks for helping, John. Of course it pretty much shows me what
procstat -k shows, and I can't get any info on the userland part, but I can
fully inspect the locks and threads.

Both threads are in TDS_INHIBITED state, and blocked on:
(kgdb) frame 0
#0  sched_switch (td=0xc5971240, newtd=0xc4d39900, flags=259)
at /usr/src/sys/kern/sched_ule.c:1864
1864            cpuid = PCPU_GET(cpuid);

print newtd->td_name
$9 = idle: cpu0\000\000\000\000\000\000\000\000\000

Is there anything you want to see to shed some light on why these threads 
might be deadlocked?

This is an 8-CURRENT kernel. I have seen this issue for a while (since March
for sure), but the running one is FreeBSD 8.0-CURRENT #2 r194183: Sun Jun 14
15:09:27 AKDT 2009. It is not a GENERIC kernel: basically I stripped
I[45]86_CPU, SCTP and unused hardware, have no PRINTF_BUFFER_SIZE, and added
wpi, ichsmb and smbus, mmc, mmcsd and sdhci, and HWPMC.
Config inlined below sig.


-- 
Mel

#
# GENERIC -- Generic kernel configuration file for FreeBSD/i386
#
# For more information on this file, please read the config(5) manual page,
# and/or the handbook section on Kernel Configuration Files:
#
#    http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html
#
# The handbook is also available locally in /usr/share/doc/handbook
# if you've installed the doc distribution, otherwise always see the
# FreeBSD World Wide Web server (http://www.FreeBSD.org/) for the
# latest information.
#
# An exhaustive list of options and more detailed explanations of the
# device lines is also present in the ../../conf/NOTES and NOTES files.
# If you are in doubt as to the purpose or necessity of a line, check first
# in NOTES.
#
# Used:
# $FreeBSD: src/sys/i386/conf/GENERIC,v 1.511 2009/03/19 20:33:26 thompsa Exp $
# $FreeBSD: src/sys/conf/NOTES,v 1.1534 2009/04/15 22:38:22 marcel Exp $
#
# This file:
# $Coar: kernels/8.x/i386/SMOOCHIES,v 1.1 2009/04/17 11:50:10 mel Exp $

cpu I686_CPU
ident   SMOOCHIES

How best to debug locking/scheduler problems

2009-06-15 Thread Mel Flynn

Hi,

I'm trying to get to the bottom of a bug with getpeername() and certain KDE4
applications, which is probably as low-level as libthr and the scheduler.

From browsing various related files in sys/kern, it seems KTR is a good bet
to get the information needed, yet it isn't really well supported in userland.
For one, after reading ktr(9) and alq(9) I've got no clue how to retrieve the
lock info or filter it in userland, other than by logging console output(?).
Gdb is useless, as the process doesn't give the information gdb wants and gdb
just hangs in wait. ktrace does not provide anything either, as there are no
more syscalls being made, so I'll have to get to the bottom of this by tracing
and filtering.

Short description of the problem:
A process never gets out of mi_switch and remains locked even when init tries
to shut it down.

% procstat -t 4283
  PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
 4283 100215 kdeinit4         -                  0  128 lock    *unp_mtx
% procstat -k 4283
  PID    TID COMM             TDNAME           KSTACK
 4283 100215 kdeinit4         -                mi_switch turnstile_wait
_mtx_lock_sleep uipc_peeraddr kern_getpeername getpeername syscall
Xint0x80_syscall
% ps -ww 4283
  PID  TT  STAT      TIME COMMAND
 4283  ??  T      0:00.38 kdeinit4: kdeinit4: kio_http http
local:/tmp/ksocket-mel/klauncherxJ1635.slave-socket
local:/tmp/ksocket-mel/plasmayC1653.slave-socket (kdeinit4)

% ls -l /tmp/ksocket-mel/
total 2
-rw-rw-r--  1 mel  wheel  62 Jun 14 22:55 KSMserver__0
srw-------  1 mel  wheel   0 Jun 14 22:55 kdeinit4__0
srwxrwxr-x  1 mel  wheel   0 Jun 14 22:55 klauncherxJ1635.slave-socket
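
Since the filtering has to happen in userland anyway, here is a rough Python
sketch of how 'procstat -t' output like the one earlier in this message could
be scanned for threads sitting in the "lock" state. It only parses text; it
does not run procstat itself. The function name and the sample string are
made up for illustration, and the naive whitespace split assumes that neither
COMM nor TDNAME contains spaces, which will not always hold.

```python
def blocked_on_lock(procstat_t_output):
    """Return (pid, tid, comm, lock_name) for threads whose STATE is 'lock'."""
    results = []
    for line in procstat_t_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row (its first field is not a number).
        if len(fields) < 8 or not fields[0].isdigit():
            continue
        pid, tid, comm, _tdname, _cpu, _pri, state, wchan = fields[:8]
        if state == "lock":
            # The name of the contended lock is shown with a leading '*'.
            results.append((int(pid), int(tid), comm, wchan.lstrip("*")))
    return results


# Sample text modelled on the procstat -t output in this message.
SAMPLE = """\
  PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
 4283 100215 kdeinit4         -                  0  128 lock    *unp_mtx
"""

print(blocked_on_lock(SAMPLE))
```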

-- 
Mel

Re: FYI Lighttpd 1.4.23 /kernel (trailing '/' on regular file symlink) vulnerability

2009-05-28 Thread Mel Flynn

On Tuesday 26 May 2009 23:20:01 Dag-Erling Smørgrav wrote:
 Dag-Erling Smørgrav d...@des.no writes:
  Like bde@ pointed out, the patch is incorrect.  It moves the test for
  v_type != VDIR up to a point where, in the case of a symlink, v_type is
  always (by definition) VLNK.

 Hmm, actually, symlinks are resolved in namei(), not lookup().  This is
 not going to be pretty.  I'll be back later...

I don't pretend to comprehend the kernel side of things fully, but wouldn't it
be easier to append a dot to all trailing slashes, either inside namei or
before the path is passed to it? This works in userland at present, and
lighttpd could use something similar as a workaround until it's fixed:
% echo this is foo > foo

% ln -fs foo bar

% cat bar/
this is foo

% cat bar/.
cat: bar/.: Not a directory
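
To make the suggested workaround concrete, a minimal userland sketch in
Python follows. It is not lighttpd code and not a kernel patch; the helper
names are hypothetical. It rewrites a path with a trailing slash to end in
"/." and then repeats the foo/bar experiment from above in a temporary
directory.

```python
import os
import tempfile


def force_directory_lookup(path):
    """Append '.' to a path ending in '/', so it resolves only for directories."""
    if path.endswith("/"):
        return path + "."
    return path


def demo():
    """Return True if 'bar/.' is rejected when bar is a symlink to a regular file."""
    with tempfile.TemporaryDirectory() as tmp:
        foo = os.path.join(tmp, "foo")
        bar = os.path.join(tmp, "bar")
        with open(foo, "w") as f:
            f.write("this is foo\n")
        os.symlink("foo", bar)  # same as: ln -fs foo bar
        try:
            os.stat(force_directory_lookup(bar + "/"))
        except NotADirectoryError:
            return True
        return False


print(force_directory_lookup("bar/"))
print(demo())
```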

-- 
Mel
