Re: tcsh loses the foreground process group?

2008-12-03 Thread Steve Watt
On Wed, Dec 03, 2008 at 03:58:36PM -0800, Nate Eldredge wrote:
> On Tue, 2 Dec 2008, Steve Watt wrote:
> 
> >In <[EMAIL PROTECTED]> Steve Watt wrote:

[ tcsh 6.15.00 ]

> >>The symptom is that when I do a long-ish running task inside a `` expansion
> >>that I then ^C, nobody gets the foreground process group...  I never get
> >>a prompt back after the ^C, and ^T gets me
> >>
> >> load: 0.27 no foreground process group
> >
> >I've got another FreeBSD machine available that was running tcsh
> >6.14.00, and it does _NOT_ display the problem.  When I build
> >6.15.00 on that same box (/usr/src is more up to date than the
> >install right now), that does fail.
>
> Thanks for the report.  It looks like this is yet another manifestation of 
> a problem in tcsh, where it does inappropriate things in a vfork'ed 
> subshell.  In my tests, running tcsh with -F (which causes it to use fork 
> instead of vfork) causes the problem to go away.  It is also present in 
> 7.0-RELEASE and probably all later versions.

Did the behavior change between 6.14.00 and 6.15.00?  (Yeah, OK, I can
go look myself.)

OK, I went and looked.  Answer:  Yep, lots of additions of inappropriate
things in backeval().  But it no longer has a 10K limit.

> There are several open bugs related to this problem, but so far they do 
> not seem to have attracted the interest of any committers.  Among them 
> are:
> 
> bin/41297
> bin/52746
> bin/125185
> amd64/128259
> bin/129378 (which you just opened)
> 
> The fix is simple: make -F the default.  There is a minor performance 
> penalty, but that's a small price to pay for correct behavior.  A more 
> involved fix would be to make tcsh not do inappropriate things after vfork 
> (modifying global variables), or at least clean up before exiting, but 
> IMHO that is less clean; vfork really shouldn't be used here at all.

Actually, there's another cost to making -F default:  It makes hashstat
rather less useful.  OK, it's not like it's that useful to begin with,
and is arguably a debugging function, but it's a side effect.

As for a possible "why" -F changes the hashstat behavior so: probably
because hashstat is counting on not-quite-legal vfork() activity, i.e. on
updates made in the vfork'ed child showing up in the parent.

Ugh.  I'd managed to forget how unfun the code inside that shell is.  I'll
try to forget again.
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: FreeBSD 7.0-RELEASE amd64 on Dell M600 Blade

2008-12-03 Thread Adam Jacob Muller


On Sep 12, 2008, at 2:03 AM, Karl Fischer wrote:

On Fri, Sep 12, 2008 at 01:00, Steven Hartland <[EMAIL PROTECTED]> wrote:

Thanks Rudi, would really like to get this sorted as they would make
ideal app servers.

- Original Message - From: "Rudi Kramer - MWEB" <[EMAIL PROTECTED]>



Hi Steven,

We recently purchased a few M600's but haven't got around to loading
FBSD on them; we should start installing next week and I will let you
know if we run into any problems.


I have the same problem on my M600 Blades; has anyone tested the 7.1 Beta and

I'm about to purchase more of them.

Karl




Anyone ever get this to work? Perhaps this was fixed in a newer
FreeBSD? Have some M600s that I'd like to get FreeBSD running on :)


Setting hint.apic.0.disabled seemed to change things a bit: it would reach
loading mfs root but then lock hard again anyway :/



-Adam



Re: tcsh loses the foreground process group?

2008-12-03 Thread Nate Eldredge

On Wed, 3 Dec 2008, Nate Eldredge wrote:

Thanks for the report.  It looks like this is yet another manifestation of a 
problem in tcsh, where it does inappropriate things in a vfork'ed subshell. 
In my tests, running tcsh with -F (which causes it to use fork instead of 
vfork) causes the problem to go away.  It is also present in 7.0-RELEASE and 
probably all later versions.


There are several open bugs related to this problem, but so far they do not 
seem to have attracted the interest of any committers.  Among them are:


bin/41297
bin/52746
bin/125185
amd64/128259
bin/129378 (which you just opened)


I have opened bin/129405 as an omnibus PR for these problems.

--

Nate Eldredge
[EMAIL PROTECTED]


Re: tcsh loses the foreground process group?

2008-12-03 Thread Nate Eldredge

On Tue, 2 Dec 2008, Steve Watt wrote:


In article <[EMAIL PROTECTED]> you write:
[ ... ]

I'm running 6-STABLE (6.4-PRE as of 24 Nov right now), tcsh 6.15.00, which
shows

 tcsh 6.15.00 (Astron) 2007-03-03 (i386-intel-FreeBSD) options wide,nls,dl,al,kan,sm,rh,color,filec

as $version.

The symptom is that when I do a long-ish running task inside a `` expansion
that I then ^C, nobody gets the foreground process group...  I never get
a prompt back after the ^C, and ^T gets me

 load: 0.27 no foreground process group


[ ... ]


One portable reproduction:
# cd /usr/src
# less `egrep -lir '^Foo.*baz' *`
^Cload: 0.02 no foreground process group

(I typed ^C ^T)

SIGKILL to the shell seems to be the only way to get things back to
normal.


I've gotten one "me too", which indicated that SIGHUP to the shell
will also make it go away, but does not solve the problem.

I've got another FreeBSD machine available that was running tcsh
6.14.00, and it does _NOT_ display the problem.  When I build
6.15.00 on that same box (/usr/src is more up to date than the
install right now), that does fail.

Thus I'm pretty comfortable saying that it's a tcsh bug of some
sort, and probably a regression.  Hopefully this can be fixed
(PR being filed now) before 6.4 releases...


Thanks for the report.  It looks like this is yet another manifestation of 
a problem in tcsh, where it does inappropriate things in a vfork'ed 
subshell.  In my tests, running tcsh with -F (which causes it to use fork 
instead of vfork) causes the problem to go away.  It is also present in 
7.0-RELEASE and probably all later versions.


There are several open bugs related to this problem, but so far they do 
not seem to have attracted the interest of any committers.  Among them 
are:


bin/41297
bin/52746
bin/125185
amd64/128259
bin/129378 (which you just opened)

The fix is simple: make -F the default.  There is a minor performance 
penalty, but that's a small price to pay for correct behavior.  A more 
involved fix would be to make tcsh not do inappropriate things after vfork 
(modifying global variables), or at least clean up before exiting, but 
IMHO that is less clean; vfork really shouldn't be used here at all.
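
To illustrate the distinction (this is not tcsh code, just a minimal sketch in Python with a made-up global named `counter`): after a real fork(), a child that modifies a global only changes its own copy, so the parent's state is untouched. A vfork'ed child, on systems where vfork shares the parent's address space, has no such isolation, which is why modifying globals there causes trouble.

```python
import os

counter = 0  # hypothetical stand-in for a piece of global shell state


def run_in_forked_child():
    """Fork, let the child modify the global, and return the parent's view."""
    global counter
    pid = os.fork()
    if pid == 0:
        # Child: this write lands in the child's private copy of memory.
        counter += 1
        os._exit(0)
    # Parent: wait for the child, then look at our own copy.
    os.waitpid(pid, 0)
    return counter


if __name__ == "__main__":
    print(run_in_forked_child())  # prints 0: the child's change is not visible
```

The sketch only demonstrates the fork() side; it does not attempt to reproduce the vfork() misbehavior, which is undefined and system-dependent.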


--

Nate Eldredge
[EMAIL PROTECTED]


Re: TCSBRK not implemented in linux compat

2008-12-03 Thread Arjan van der Velde
Hi, thanks. I think for now I can work around this, although it would
be nice to have this implemented.


Regards,

Arjan

On Dec 2, 2008, at 11:13 PM, Ed Schouten wrote:


Hello Arjan,

* Arjan van der Velde <[EMAIL PROTECTED]> wrote:
While trying to get a linux binary running on FreeBSD I encountered the
following problem during serial port I/O.

Dec  1 22:22:34 soekris kernel: linux: pid 7239 (linuxbinary): ioctl
fd=0, cmd=0x5409 ('T',9) is not implemented

0x5409  turns out to be TCSBRK, which is not implemented (yet?). Can
anyone give me some clues where / how to start implementing this? It
seems like the linux way of handling it is to call tcdrain(), but I'm
not sure as to how this translates to the FreeBSD compat layer.
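
For reference, the ('T',9) in the log line is just the two low bytes of the command number, assuming the classic Linux tty ioctl numbering in which the second byte is an ASCII group letter and the low byte is a sequence number. A small Python sketch (the helper name is made up for illustration):

```python
def split_linux_tty_ioctl(cmd):
    """Split a classic Linux tty ioctl number into (group letter, number)."""
    group = chr((cmd >> 8) & 0xFF)  # second-lowest byte, e.g. 0x54 is 'T'
    number = cmd & 0xFF             # lowest byte, e.g. 0x09 is 9
    return group, number


print(split_linux_tty_ioctl(0x5409))  # ('T', 9)
```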


I think you could just make it call TIOCDRAIN directly. Unfortunately
that's not correct if the argument is 0, because then we have to call
TIOCSBRK and TIOCCBRK with a 250 msec interval. I guess adding some kind
of printf() there should be good enough for now.
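
The mapping described above can be sketched as a small decision table. This is Python used as pseudologic, not compat-layer code: the function name and the (ioctl name, delay) tuples are invented for illustration, and only the ioctl names and the 250 msec figure come from the message.

```python
def translate_tcsbrk(arg):
    """Return the FreeBSD-side steps for a Linux TCSBRK ioctl with argument arg.

    Each step is (ioctl_name, seconds_to_wait_before_the_next_step).
    """
    if arg == 0:
        # Argument 0 means "send a break": assert it, wait 250 msec, clear it.
        return [("TIOCSBRK", 0.25), ("TIOCCBRK", 0.0)]
    # Any other argument: just wait for pending output to drain.
    return [("TIOCDRAIN", 0.0)]


print(translate_tcsbrk(0))  # [('TIOCSBRK', 0.25), ('TIOCCBRK', 0.0)]
print(translate_tcsbrk(1))  # [('TIOCDRAIN', 0.0)]
```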

I can't look into it right now, because I have to get up at 6:15
tomorrow. Sorry! :-/

--
Ed Schouten <[EMAIL PROTECTED]>
WWW: http://80386.nl/




vm_map_entry for kernel virtual address

2008-12-03 Thread Alexej Sokolov
Hello,
If I allocate memory from a kernel module:
MALLOC(addr, vm_offset_t, PAGE_SIZE, M_DEVBUF, M_WAITOK | M_ZERO);

how can I get a pointer to the vm_map_entry structure which describes the
memory region where "addr" is?

Thanks,
Alexey


Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-03 Thread Kostik Belousov
On Tue, Dec 02, 2008 at 04:15:38PM -0800, David Wolfskill wrote:
> I seem to have a fairly- (though not deterministically so) reproducible
> mode of failure with an NFS-mounted directory hierarchy:  An attempt to
> traverse a "sufficiently large" hierarchy (e.g., via "tar zcpf" or "rm
> -fr") will fail to "visit" some subdirectories, typically apparently
> acting as if the subdirectories in question do not actually exist
> (despite the names having been returned in the output of a previous
> readdir()).
> 
> The file system is mounted read-write, courtesy of amd(8); none of
> the files has any non-default flags; there are no ACLs involved;
> and I owned the lot (that is, as "owning user" of the files).
> 
> An example of "sufficiently large" has been demonstrated to be a recent
> copy of a FreeBSD ports tree.  (The problem was discovered using a
> hierarchy that had some proprietary content; I tried a copy of the ports
> tree to see if I could replicate the issue with something a FreeBSD
> hacker would more likely have handy.  And avoid NDA issues.  :-})
> 
> Now, before I go further: I'm not pointing the finger at FreeBSD,
> here (yet).  At minimum, there could be fault with FreeBSD (as the NFS
> client); with amd(8); with the NetApp Filer (as the NFS server);
> or the network -- or the configuration(s) of any of them.
> 
> But I just tried this, using the same NFS server, but a machine running
> Solaris 8 as an NFS client, and was unable to re-create the problem.
> 
> And I found a way to avoid having the problem occur using a FreeBSD NFS
> client:  whack amd(8)'s config so that the dismount_interval is 12 hours
> instead of the default 2 minutes, thus effectively preventing amd(8) from
> its normal attempts to unmount file systems.  Please note that I don't
> consider this a fix -- or even an acceptable circumvention, in the long
> term.  Rather, it's a diagnostic change, in an attempt to better
> understand the nature of the problem.
> 
> Here are step-by-step instructions to recreate the problem;
> unfortunately, I believe I don't have the resources to test this
> anywhere but at work, though I will try it at home, to the extent
> that I can:
> 
> * Set up the environment.
>   * The failing environment uses NetApp filers as NFS servers.  I don't
>     know what kind or how recent the software is on them, but can
>     find out.  (I expect they're fairly well-maintained.)
>   * Ensure that the NFS space available is at least 10 GB.
>     I will refer to this as "~/NFS/", as I tend to create such symlinks
>     to keep track of things.
>   * I used a dual, quad-core machine running FreeBSD RELENG_7_1 as of
>     yesterday morning as an NFS client.  It also had a recently-updated
>     /usr/ports tree, which was a CVS working directory (so each "real"
>     subdirectory also had a CVS subdirectory within it).
>   * Set up amd(8) so that ~/NFS is mounted on demand when it's
>     referenced, and only via amd(8).  Ensure that the dismount_interval
>     has the default value of 120 seconds.
> * Create a reference tarball.
>   * cd /usr && tar zcpf ~/NFS/ports.tgz ports/
> * Create the test directory hierarchy.
>   * cd ~/NFS && tar zxpf ports.tgz
> * Clear any cache.
>   * Unmount ~/NFS, then re-mount it.  Or just reboot the NFS client
>     machine.  Or arrange to have done all of the above set-up stuff
>     from a different NFS client.
> * Set up for information capture (optional).
>   * Use ps(1) or your favorite alternative tool to determine the PID for
>     amd(8).  Note that `cat /var/run/amd.pid` won't do the trick.  :-{
>   * Run ktrace(1) to capture activity from amd(8) and its descendants,
>     e.g.:
> 
>       sudo ktrace -dip ${amd_pid} -f ktrace_amd.out
> 
>   * Start a packet-capture for NFS traffic, e.g.:
> 
>       sudo tcpdump -s 0 -n -w nfs.bpf host ${nfs_server}
> 
> * Start the test.
>   * Do this under ktrace(1), if you did the above optional step:
> 
>       rm -fr ~/NFS/ports; echo $?
> 
>     As soon as rm(1) issues a whine, you might as well interrupt it
>     (^C).
> 
> * Stop the information capture, if you started it.
>   * ^C for the tcpdump(1) process.
>   * sudo ktrace -C
> 
> If the packet capture file is too big for the analysis program you
> prefer to digest as a unit, see the net/tcpslice port for a bit of
> relief.  (Wireshark seems to want to read an entire packet capture file
> into main memory.)
> 
> I have performed the above, with the "information-gathering" step; I can
> *probably* make that information available, but I'll need to check --
> some organizations get paranoid about things like host names.  I don't
> expect that my current employer is, but I don't know yet, so I won't
> promise.
> 
> In the mean time, I should be able to extract somewhat-relevant
> information from what I've collected, if that would be useful.  While I
> wouldn't mind sharing the results, I strongly suspect that blow-by-blow
> analysis wouldn't be ideal for this (or any other)

Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-03 Thread Danny Braniss
> 
> On Wed, Dec 03, 2008 at 02:20:32PM +0200, Danny Braniss wrote:
> > ...
> > i'll try to check it here soon, but in the meantime, could you try the same
> > but mounting directly, not via amd, to remove one item from the equation?
> > (I don't know how much amd is involved here, but if you are running on a
> > 64bit host, amd could be swapped out, in which case it tends to really screw
> > things up, which is not your case, but ...)
> 
> Sorry; I should have mentioned that the NFS client was running
> RELENG_7_1 as of Monday morning, i386 arch.  The amd.conf file specifies
> "plock" for amd(8).
> 
> Note that merely telling amd(8) to kick the interval of attempted
> unmounts from 2 minutes to 12 hours appears to avoid the observed
> symptoms, so I'm fairly confident that bypassing amd(8) altogether would
> do so as well.
> 
> In looking at the output from ktrace against amd(8), I recall having
> seen that shortly before an observed failure, the (master) amd
> process forks a child to attempt the unmount; the child issues an
> unmount, the return for which is EBUSY (IIRC -- I'm not in a good
> position to check just at the moment), so the child terminates with an
> "interrupted system call".
> 
> I'd have thought that since the attempted unmount failed, it wouldn't
> make any difference, but it's right around that point that rm(1) is told
> that a directory entry it found earlier doesn't exist, which rather
> snowballs into the previously-described symptoms.

so it does point to amd - or something innocent it does - which triggers the
error.
btw, there are some patches (5 I think) that try to fix some of amd's problems.
I've installed them, and things are quiet/ok -most of the time- but I get a
glitch once in a while. would love to iron them out though.

cheers,
danny




Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-03 Thread David Wolfskill
On Wed, Dec 03, 2008 at 02:20:32PM +0200, Danny Braniss wrote:
> ...
> i'll try to check it here soon, but in the meantime, could you try the same
> but mounting directly, not via amd, to remove one item from the equation?
> (I don't know how much amd is involved here, but if you are running on a
> 64bit host, amd could be swapped out, in which case it tends to really screw
> things up, which is not your case, but ...)

Sorry; I should have mentioned that the NFS client was running
RELENG_7_1 as of Monday morning, i386 arch.  The amd.conf file specifies
"plock" for amd(8).

Note that merely telling amd(8) to kick the interval of attempted
unmounts from 2 minutes to 12 hours appears to avoid the observed
symptoms, so I'm fairly confident that bypassing amd(8) altogether would
do so as well.

In looking at the output from ktrace against amd(8), I recall having
seen that shortly before an observed failure, the (master) amd
process forks a child to attempt the unmount; the child issues an
unmount, the return for which is EBUSY (IIRC -- I'm not in a good
position to check just at the moment), so the child terminates with an
"interrupted system call".

I'd have thought that since the attempted unmount failed, it wouldn't
make any difference, but it's right around that point that rm(1) is told
that a directory entry it found earlier doesn't exist, which rather
snowballs into the previously-described symptoms.
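
One way to watch for this symptom from user space, independent of rm(1), is to walk the tree and record every name that a directory listing returned but that a subsequent lstat() reports as missing. The following is only a sketch in Python; the function name is hypothetical and nothing like it was used in the tests described here.

```python
import os
import stat


def find_vanished_entries(root):
    """Walk root; return paths that were listed but then reported as missing."""
    vanished = []
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            names = os.listdir(directory)
        except FileNotFoundError:
            # The directory was seen earlier but can no longer be opened.
            vanished.append(directory)
            continue
        for name in names:
            path = os.path.join(directory, name)
            try:
                info = os.lstat(path)
            except FileNotFoundError:
                # The listing returned this name, yet lstat() says ENOENT.
                vanished.append(path)
                continue
            if stat.S_ISDIR(info.st_mode):
                stack.append(path)
    return vanished
```

On a healthy, otherwise idle file system the returned list should be empty; run against ~/NFS/ports while amd(8) is attempting its periodic unmounts, a non-empty list would correspond to the entries rm(1) complains about.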

Peace,
david
-- 
David H. Wolfskill  [EMAIL PROTECTED]
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-03 Thread Danny Braniss
> 
> I seem to have a fairly- (though not deterministically so) reproducible
> mode of failure with an NFS-mounted directory hierarchy:  An attempt to
> traverse a "sufficiently large" hierarchy (e.g., via "tar zcpf" or "rm
> -fr") will fail to "visit" some subdirectories, typically apparently
> acting as if the subdirectories in question do not actually exist
> (despite the names having been returned in the output of a previous
> readdir()).
> 
> The file system is mounted read-write, courtesy of amd(8); none of
> the files has any non-default flags; there are no ACLs involved;
> and I owned the lot (that is, as "owning user" of the files).
> 
> An example of "sufficiently large" has been demonstrated to be a recent
> copy of a FreeBSD ports tree.  (The problem was discovered using a
> hierarchy that had some proprietary content; I tried a copy of the ports
> tree to see if I could replicate the issue with something a FreeBSD
> hacker would more likely have handy.  And avoid NDA issues.  :-})
> 
> Now, before I go further: I'm not pointing the finger at FreeBSD,
> here (yet).  At minimum, there could be fault with FreeBSD (as the NFS
> client); with amd(8); with the NetApp Filer (as the NFS server);
> or the network -- or the configuration(s) of any of them.
> 
> But I just tried this, using the same NFS server, but a machine running
> Solaris 8 as an NFS client, and was unable to re-create the problem.
> 
> And I found a way to avoid having the problem occur using a FreeBSD NFS
> client:  whack amd(8)'s config so that the dismount_interval is 12 hours
> instead of the default 2 minutes, thus effectively preventing amd(8) from
> its normal attempts to unmount file systems.  Please note that I don't
> consider this a fix -- or even an acceptable circumvention, in the long
> term.  Rather, it's a diagnostic change, in an attempt to better
> understand the nature of the problem.
> 
> Here are step-by-step instructions to recreate the problem;
> unfortunately, I believe I don't have the resources to test this
> anywhere but at work, though I will try it at home, to the extent
> that I can:
> 
> * Set up the environment.
>   * The failing environment uses NetApp filers as NFS servers.  I don't
>     know what kind or how recent the software is on them, but can
>     find out.  (I expect they're fairly well-maintained.)
>   * Ensure that the NFS space available is at least 10 GB.
>     I will refer to this as "~/NFS/", as I tend to create such symlinks
>     to keep track of things.
>   * I used a dual, quad-core machine running FreeBSD RELENG_7_1 as of
>     yesterday morning as an NFS client.  It also had a recently-updated
>     /usr/ports tree, which was a CVS working directory (so each "real"
>     subdirectory also had a CVS subdirectory within it).
>   * Set up amd(8) so that ~/NFS is mounted on demand when it's
>     referenced, and only via amd(8).  Ensure that the dismount_interval
>     has the default value of 120 seconds.
> * Create a reference tarball.
>   * cd /usr && tar zcpf ~/NFS/ports.tgz ports/
> * Create the test directory hierarchy.
>   * cd ~/NFS && tar zxpf ports.tgz
> * Clear any cache.
>   * Unmount ~/NFS, then re-mount it.  Or just reboot the NFS client
>     machine.  Or arrange to have done all of the above set-up stuff
>     from a different NFS client.
> * Set up for information capture (optional).
>   * Use ps(1) or your favorite alternative tool to determine the PID for
>     amd(8).  Note that `cat /var/run/amd.pid` won't do the trick.  :-{
>   * Run ktrace(1) to capture activity from amd(8) and its descendants,
>     e.g.:
> 
>       sudo ktrace -dip ${amd_pid} -f ktrace_amd.out
> 
>   * Start a packet-capture for NFS traffic, e.g.:
> 
>       sudo tcpdump -s 0 -n -w nfs.bpf host ${nfs_server}
> 
> * Start the test.
>   * Do this under ktrace(1), if you did the above optional step:
> 
>       rm -fr ~/NFS/ports; echo $?
> 
>     As soon as rm(1) issues a whine, you might as well interrupt it
>     (^C).
> 
> * Stop the information capture, if you started it.
>   * ^C for the tcpdump(1) process.
>   * sudo ktrace -C
> 
> 
> If the packet capture file is too big for the analysis program you
> prefer to digest as a unit, see the net/tcpslice port for a bit of
> relief.  (Wireshark seems to want to read an entire packet capture file
> into main memory.)
> 
> I have performed the above, with the "information-gathering" step; I can
> *probably* make that information available, but I'll need to check --
> some organizations get paranoid about things like host names.  I don't
> expect that my current employer is, but I don't know yet, so I won't
> promise.
> 
> In the mean time, I should be able to extract somewhat-relevant
> information from what I've collected, if that would be useful.  While I
> wouldn't mind sharing the results, I stro