`dump` and/or `restore` incorrectly handles /dev files

2002-09-03 Thread Patrick Thomas


Try this - it's good for a laugh:

ls -asl dev/*mem

 0 crw-r-----  1 root  kmem    2,   1 Aug 27 15:16 kmem
 0 crw-r-----  1 root  kmem    2,   0 Aug 27 15:16 mem

Now run this command, changing some permissions:

chmod -w dev/mem ; chmod -w dev/kmem

Now, dump that filesystem that your /dev resides on with:

`dump -0a -f /some/file /dev/ad0a`

Now, restore your dump file (/some/file) with:

`restore -x -f /some/file`

(I just restored into some arbitrary directory) (answered 1 for which
volume to start with, and answered y to the trailing set owner/mode for
. question)

Now, once again, ls -asl dev/*mem

 0 crw-------  1 root  wheel   2,   1 Sep  3 01:13 kmem
 0 crw-------  1 root  wheel   2,   0 Sep  3 01:13 mem

---

Gee, that's funny - not only are they _not_ -w as they were changed to
before dumping, but they've also lost a r- as well !

Easily reproducible.  Don't respond to this thread if all you have to say
is well you shouldn't be chmodding those files -w anyway.

--pt
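
One way to check for this kind of drift after any restore is to compare permission bits, owner and group between the original tree and the restored copy. What follows is only a minimal sketch: the function name `compare_metadata` and its arguments are hypothetical helpers, not part of dump(8) or restore(8), and it assumes both trees are mounted and readable by the caller.

```python
import os
import stat


def compare_metadata(original_root, restored_root):
    """Return (relative_path, field, original, restored) for every mismatch."""
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(original_root):
        # os.walk lists device nodes and other special files under filenames.
        for name in dirnames + filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, original_root)
            dst = os.path.join(restored_root, rel)
            if not os.path.lexists(dst):
                mismatches.append((rel, "missing", "present", "absent"))
                continue
            # lstat inspects the entry itself instead of following symlinks.
            a, b = os.lstat(src), os.lstat(dst)
            if stat.S_IMODE(a.st_mode) != stat.S_IMODE(b.st_mode):
                mismatches.append((rel, "mode",
                                   stat.filemode(a.st_mode),
                                   stat.filemode(b.st_mode)))
            if a.st_uid != b.st_uid:
                mismatches.append((rel, "uid", a.st_uid, b.st_uid))
            if a.st_gid != b.st_gid:
                mismatches.append((rel, "gid", a.st_gid, b.st_gid))
    return mismatches
```

Calling it with the original /dev directory and the restored one would list entries such as mem and kmem if their mode or group changed during the dump/restore round trip.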


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: setting quotas _inside_ a jail for users _inside_ a jail

2002-09-01 Thread Patrick Thomas


No, sorry I think that I was misunderstood - here is my situation:

- I have a host machine with no users - just root.
- on that host machine I have a vn-backed FS 500 megs in size
- on that vn-backed FS, I run a jail - and no other jails share that
vn-backed FS (although other jails may share the underlying actual disk FS
that the vn is on...)

Now, I die in a car accident and nobody ever logs into the host system
again or touches anything on the _host system_.

Can the root user of the _jail running on the host system_ set up quotas
for her users ?  Let's assume the root user and all her other users don't
even know it is a jail - as far as they are concerned, it's just their
freebsd machine.

So the question is, can this root user set up quotas ?  And if so, some
hints on exactly what needs to go into /etc/fstab _inside their jail_,
since specifying anything in there seems to have the side effects of:

a) not working as expected
b) causing the jail not to be startable.

thanks,

PT

On Sun, 1 Sep 2002, Robert Watson wrote:


 On Fri, 30 Aug 2002, Patrick Thomas wrote:

  I realize the difficulties in trying to use quotas on the _host_
  system to limit the size of jails on the host system - userid mapping,
  etc.  This is not what I am asking.
 
  I wonder, is it possible for the root user of a jail to set quotas
  _inside_ her jail for users _inside_ her jail ?  Can anyone simply
  confirm or deny that this is possible ?
 
  Simply following normal protocol does not work, because if you place
  filesystem entries into /etc/fstab inside the jail, the jail will no
  longer start, as it does not have permission to mount or otherwise
  manipulate those filesystems.

 Other than the access control checks in the quota code being influenced by
 the jail, there really is no relationship between jails and quotas.
 Jails are solely a property of processes and other credential-bearing
 kernel objects.  Persistent and transient quota information is stored
 relative to uids and gids, and quotas are enforced based on those elements
 of the process credential, and are not impacted by the jail field.  This
 means that if a file system is shared by two jails, and a particular uid
 is in use in both jails, both sets of processes will be impacted by the
 same quota.

 Privileged users can perform quota management calls on any file system
 they can name via a visible file object.  If quota management calls were
 permitted from jail, they could likewise be performed on any file system
 visible in the jail.  If only appropriate file systems are visible from
 the jail, you could add PRISON_ROOT to the flags field of the relevant
 suser call.  If you expose file systems to the jail that you don't want
 the root user in the jail to set quotas on, you may be out of luck.  I
 take it from your description that you're interested in imposing quotas on
 the users in the jail, not quotas on the jail itself?

 Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
 [EMAIL PROTECTED]  Network Associates Laboratories







setting quotas _inside_ a jail for users _inside_ a jail

2002-08-30 Thread Patrick Thomas


Hello,

I realize the difficulties in trying to use quotas on the _host_ system to
limit the size of jails on the host system - userid mapping, etc.  This is
not what I am asking.

I wonder, is it possible for the root user of a jail to set quotas
_inside_ her jail for users _inside_ her jail ?  Can anyone simply confirm
or deny that this is possible ?

Simply following normal protocol does not work, because if you place
filesystem entries into /etc/fstab inside the jail, the jail will no
longer start, as it does not have permission to mount or otherwise
manipulate those filesystems.

Comments ?  Thoughts ?  Confirmations or denials ?

thanks!





Re: top shows all zeroes.

2002-08-29 Thread Patrick Thomas


Ok, this seems to have died down a bit, and my own urgency has passed
since it is no longer manifesting itself on my test machine. However,
two things come to mind:

1. is it possible that arbitrary top output is now suspect on machines
that have manifested this behavior ?  I am not showing all zeros anymore,
but who is to say that what I am seeing is correct ?  My vmstat -i now
yields:

rtc irq8 29272122 66

and I am seeing a rate of 128 on normal systems.  So maybe my top output
is still wrong, even though it isn't all zeros.

2. What is to be done ?  I have no reason to believe this won't crop up on
4.6.2 or later...does anyone else ?

thanks.  pat.






Re: top shows all zeroes.

2002-08-26 Thread Patrick Thomas


ok:

# vmstat -i
interrupt        total   rate
ata0 irq14          23      0
ahc0 irq10          15      0
aac0 irq2      6330470     30
fxp0 irq5     17556113     83
fdc0 irq6            4      0
sio0 irq4            8      0
sio1 irq3            8      0
clk irq0      21008332     99
rtc irq8        264460      1
Total         45159433    214

Now, when I repeat vmstat -i, all of these numbers (or rather, all of the
large numbers) increase _except_ for `rtc irq8`.

So is this just a simple broken clock on the system, as in, my hardware
clock is physically broken/breaking ?

dmesg says nothing about irq8, so I assume there is no conflict.

Further, regarding the APM conjecture, this is a server and (although I
may be mistaken) does not have APM in the bios at all - I have also
removed it from the kernel.  dmesg tends to confirm the absence of APM.

--bpat

On Mon, 26 Aug 2002, David Malone wrote:

 On Sun, Aug 25, 2002 at 04:49:23PM -0700, Patrick Thomas wrote:
  Also, just to add a bit more info, sometimes instead of rebooting to solve
  the problem, the problem doesn't exist, and rebooting causes it to
  manifest.  So it seems fairly random.

 Can you watch vmstat -i before and after the problem occurs? I'm
 guessing that one of the interrupt counts will stop increasing.

   David.






Re: top shows all zeroes.

2002-08-26 Thread Patrick Thomas


ok, after 2+ days, for no discernible reason I now have real top stats
back.

This has occurred within the last 20 minutes, and I have done nothing at
all on the system save normal operation.  vmstat -i now tells me:

# vmstat -i
...
rtc irq8   479105  2
...


The 479105 number is steadily rising ...

and now, about 30 mins later I am at:

rtc irq8   938264  4

--pt
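
The rate column of vmstat -i is an average since boot, so it stays low for a long time after a counter has been stuck. A rate taken between two samples is more telling. The sketch below uses the two rtc irq8 totals quoted above and assumes exactly 30 minutes between them (the message only says "about 30 mins"); `interrupt_rate` is a hypothetical helper name.

```python
def interrupt_rate(count_before, count_after, elapsed_seconds):
    """Average interrupts per second between two vmstat -i samples."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (count_after - count_before) / elapsed_seconds


# Totals for rtc irq8 from the two vmstat -i runs, assumed 30 minutes apart.
rate = interrupt_rate(479105, 938264, 30 * 60)
print(f"rtc irq8: about {rate:.0f} interrupts/s between samples")
```

Because the interval is only approximate, the result is a ballpark figure, but it shows the counter is now advancing far faster than the 2 to 4 per second in the since-boot column suggests.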




On Mon, 26 Aug 2002, Lars Eggert wrote:

 Patrick Thomas wrote:
  Now, when I repeat vmstat -i, all of these numbers (or rather, all of the
  large numbers) increase _except_ for `rtc irq8`.

 interrupt        total   rate
 mux irq11         4851     12
 ata0 irq14       94219    240
 atkbd0 irq1        399      1
 fdc0 irq6            2      0
 ppc0 irq7            1      0
 clk irq0         39123    100
 Total           138595    354

 Large ones increasing, too, but I don't seem to have rtc.

  Further, regarding the APM conjecture, this is a server and (although I
  may be mistaken) does not have APM in the bios at all - I have also
  removed it from the kernel.  dmesg tends to confirm the absence of APM.

 Mine's a laptop with APM enabled (BIOS + kernel).

 Lars
 --
 Lars Eggert [EMAIL PROTECTED]   USC Information Sciences Institute






Re: top shows all zeroes.

2002-08-26 Thread Patrick Thomas


I will note that my system is a dual processor system, no APM hardware in
it, and I have an identical machine running a kernel built from an
identical kernel configuration file running an identical FreeBSD system
that has _never_ had the problem.



On Mon, 26 Aug 2002, Bruce M Simpson wrote:

 On Mon, Aug 26, 2002 at 11:02:50AM -0700, Peter Wemm wrote:
  This has happened before.  For some reason, the RTC stops sending the 128Hz
  statclock (statistics clock) interrupts.  One way to unwedge that in the past
  was to break into ddb and do a 'show rtc' command.. but that is hardly a
  solution.  I thought we had solved this problem.
 
  APM however is a known culprit for causing badness here.

 I should add that my Vaio has APM compiled into the kernel. I've also done
 the vmstat -i inspection briefly, all interrupt counters seem to be
 incrementing as normal. This problem may have cropped up after a set of
 suspend/resume sequences; right now I've had 3 warm reboots since
 yesterday (the laptop has been plugged in and unmoved), the problem has not
 yet manifested itself, but when I last noticed it, I had been suspending
 and resuming between leaving home and work.

 I realize this is purely anecdotal but I'll continue to observe for the
 problem re-emerging.

 BMS






Re: top shows all zeroes.

2002-08-25 Thread Patrick Thomas


No, world and kernel out of sync is _not_ the problem in my case - I made
4.6.1-RC2 diskettes and did a ftp installation - so there was no upgrading
involved.

Further, this is an intermittent problem - sometimes it happens, sometimes
it doesn't.  I think some people have reported it on non RC2 4.6-RELEASE.

--pt

On Sat, 24 Aug 2002, Brian T. Schellenberger wrote:

 On Saturday 24 August 2002 12:00 pm, Patrick Thomas wrote:
 | And more importantly, does anyone know _why_ it is happening and what
 | it means for a system affected ?

 It usually means that the kernel and the world are out of sync.  How did
 you update to 4.6.1-RC2?


 |
 | On Sat, 24 Aug 2002, Bruce M Simpson wrote:
 |  On Sat, Aug 24, 2002 at 12:23:45AM -0700, Patrick Thomas wrote:
 |   I have seen this twice on 4.6.1-RC2:
 | 
 |  [..]
 | 
 |   CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0%
 |   interrupt,  0.0% idle
 | 
 |  [..]
 | 
 |  This is happening on my Vaio also; has anyone filed a PR?
 | 
 |  FreeBSD triage.dollah.com 4.6-STABLE FreeBSD 4.6-STABLE #0: Tue Aug
 |  20 13:00:06 BST 2002
 |  [EMAIL PROTECTED]:/usr/src/sys/compile/TRIAGE  i386
 | 
 |  BMS
 |

 --
 Brian, the man from Babble-On . . . .   [EMAIL PROTECTED] (personal)






Re: top shows all zeroes.

2002-08-25 Thread Patrick Thomas


 It's usually gone after a reboot. Haven't debugged it further since I
 saw no other problems.

Yes, but other times it is not manifesting, and it _starts_ after a
reboot.

Also, concerning solving the problem with a reboot, although my system is
merely a test machine, I am fairly certain that a considerable number of
people are using FreeBSD release versions as important and mission
critical pieces of their businesses...so this would not be an option for
some folks (rebooting frequently to solve).

_I_ understand that nobody but actual FreeBSD developers has any business
relying on it for anything critical, but I am not sure that has been made
public successfully.  That is, I think a fair number of people are taken
by surprise by things like this.  Just my two cents.

--ptat





Re: top shows all zeroes.

2002-08-25 Thread Patrick Thomas


 Well, the actual *release* versions *are* supposed to be reliable for
 mission-critical applications.  The purpose of the RC and STABLE
 versions being to find problems so that they don't make it to the
 release versions.


A lofty goal, indeed.  However it has been pointed out that this problem
manifests itself in plain old 4.6-RELEASE.

Also, just to add a bit more info, sometimes instead of rebooting to solve
the problem, the problem doesn't exist, and rebooting causes it to
manifest.  So it seems fairly random.





Re: possible to expand a file for vn-device FS usage ?

2002-08-16 Thread Patrick Thomas


Thank you for the very clear explanation.  Does there exist a utility to
immediately take a partition that has been growfs'd and fix it so that
it does not experience this performance penalty ?

That is, I am willing to sit and wait 10 minutes while some utility
rearranges and reorganizes the unmounted filesystem if it means I don't
have to dump/restore/blah/blah and if it allows me to avoid the
performance penalty you mentioned...

thanks,

PT






possible to expand a file for vn-device FS usage ?

2002-08-15 Thread Patrick Thomas


I have a 500meg file that I dd'd and have mounted as a vn-device
filesystem.  I would like to increase this to 1gig, however it is very
time consuming to do a dump of the FS to a file, dd a new larger one, then
do a restore (I have many special files in the FS, thus the need for
dump).

Is there a procedure wherein I can just unmount the file, expand it, then
remount it ?  I realize some trickery is needed as the newfs originally
done on the file will then be wrong, etc. - possibly disklabel as well -
but I am willing to run a new disklabel and/or newfs command on the file
in addition to expanding it.

Any suggestions on how to expand that file without doing the dump/restore
steps ?

thanks,

PT






Re: possible to expand a file for vn-device FS usage ?

2002-08-15 Thread Patrick Thomas


What is the negative effect of this fragmentation, and does it mean I
won't be able to use all of the space that I added ?


On Thu, 15 Aug 2002, Terry Lambert wrote:

 Daniel O'Connor wrote:
  On Thu, 2002-08-15 at 17:04, Patrick Thomas wrote:
   Any suggestions on how to expand that file without doing the dump/restore
   steps ?
 
  man 8 growfs perchance? :)

 You can unmount it, grow the underlying file with:

   dd if=/dev/zero bs=XXX count=XXX >> filename

 and *THEN* use growfs(8) on it.

 Doing this will leave the allocation layout in the same state
 that it is at present, so the bottom half of the FS will end
 up fragmented, even though there is free space at the top (FS
 growing does not equally redistribute the FS content into the
 newly enlarged space).

 The best approach is the same as it would be for a device:
 dump and restore the FS from the old image to the new.  In
 the vn device case, you could just create a new empty FS of
 the necessary size, and dump from the old piped to a restore
 of the new.

 If you can live with the internal fragmentation, use growfs(8);
 if you can't, use dump/restore.  IMO, you will have less
 potential for future problems if you use dump/restore.

 -- Terry
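
The file-growing step described above (appending zeros, then running growfs(8)) can also be sketched in Python. This is an illustration only: `grow_image` is a hypothetical helper, it assumes the filesystem has been unmounted and the vn device unconfigured first, and growfs(8) still has to be run on the result afterwards.

```python
import os


def grow_image(path, new_size, chunk=1 << 20):
    """Append zero bytes to the image file until it is new_size bytes long."""
    current = os.path.getsize(path)
    if new_size < current:
        raise ValueError("refusing to shrink the image")
    remaining = new_size - current
    zeros = bytes(chunk)
    # Append mode leaves the existing filesystem data untouched.
    with open(path, "ab") as image:
        while remaining > 0:
            step = min(chunk, remaining)
            image.write(zeros[:step])
            remaining -= step
    return os.path.getsize(path)
```

Writing real zero blocks, as dd does, allocates the space up front; that choice is deliberate here so the backing filesystem cannot run out of room later while the vn device is in use.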






status of UDP in jail bug ?

2002-08-09 Thread Patrick Thomas


There is (was?) a problem with jail that, among other things, made it
impossible for an ircd server to perform reverse lookups for clients.

In the news archives, there were complaints about this, and after a not so
good patch, eventually a good patch was posted by:

From: Lamont Granquist ([EMAIL PROTECTED])
   Subject: UDP jail bug patch (was Re: (PATCH) Re: jail bug with
   ircd-hybrid

I have two questions:

1. Does anyone know which versions of FreeBSD this patch will work on ?
2. Do I need this patch anymore on 4.6 and above ?





resolver workaround conceptually possible ?

2002-07-16 Thread Patrick Thomas


I am under the impression that at this time there is no workaround for the
resolver problem - you are forced to reinstall or upgrade.

I am curious though, is it at least conceptually possible that there could
be a workaround ?  If so, what would it entail ?

thanks - pt





Re: resolver workaround conceptually possible ?

2002-07-16 Thread Patrick Thomas


 Assuming that bind9 has been fixed, you could use bind9 for your local
 resolver and it will filter anything nasty out as a side effect of the
 fact that it always constructs replies, rather than caching a reply and
 forwarding the reply as-is to the resolver client (as bind8 does).

Thank you very much.  I would like to clarify two things - first, that I
can fix bind9 by simply grabbing the tarball, configure;make;make
install  ...  or do I have to change libraries on the system itself and
otherwise rearrange things in order for bind9 to compile fixed ?

That is, just update bind9 as normal ?

Second, again, just to clarify, this is a full fix ?  Once someone does
this they can rest easy ?

thanks a lot!





Should I be concerned ?

2002-07-06 Thread Patrick Thomas


I saw this show up all over my ssh session into a server today:


NOTICE:  --Relation pg_toast_16386--
NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0,
Keep/VTL 0/0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space
0/0; EndEmpty/Avail. Pages 0/0.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Index pg_toast_16386_idx: Pages 1; Tuples 0.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Analyzing pg_relcheck
NOTICE:  --Relation pg_rewrite--
NOTICE:  Pages 4: Changed 0, reaped 0, Empty 0, New 0; Tup 23: Vac 0,
Keep/VTL 0/0, UnUsed 0, MinLen 104, MaxLen 1456; Re-using: Free/Avail.
Space 8496/8496; EndEmpty/Avail. Pages 0/4.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Index pg_rewrite_oid_index: Pages 2; Tuples 23.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Index pg_rewrite_rulename_index: Pages 2; Tuples 23.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Rel pg_rewrite: Pages: 4 --> 4; Tuple(s) moved: 0.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  --Relation pg_toast_16410--
NOTICE:  Pages 2: Changed 0, reaped 0, Empty 0, New 0; Tup 5: Vac 0,
Keep/VTL 0/0, UnUsed 0, MinLen 163, MaxLen 2034; Re-using: Free/Avail.
Space 8088/8088; EndEmpty/Avail. Pages 0/2.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Index pg_toast_16410_idx: Pages 2; Tuples 5.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Rel pg_toast_16410: Pages: 2 --> 2; Tuple(s) moved: 0.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Analyzing pg_rewrite
NOTICE:  --Relation pg_statistic--
NOTICE:  Pages 6: Changed 6, reaped 3, Empty 0, New 0; Tup 98: Vac 98,
Keep/VTL 0/0, UnUsed 8, MinLen 80, MaxLen 668; Re-using: Free/Avail. Space
26560/26484; EndEmpty/Avail. Pages 0/4.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Index pg_statistic_relid_att_index: Pages 2; Tuples 98: Deleted
98.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Rel pg_statistic: Pages: 6 --> 3; Tuple(s) moved: 90.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Index pg_statistic_relid_att_index: Pages 2; Tuples 98: Deleted
90.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  --Relation pg_toast_16408--
NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0,
Keep/VTL 0/0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space
0/0; EndEmpty/Avail. Pages 0/0.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  Index pg_toast_16408_idx: Pages 1; Tuples 0.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
DEBUG:  recycled transaction log file 00B6



Any ideas as to what this means and what I should do (if anything) about
it ?

thanks,

pat





using `restore` without user input

2002-07-03 Thread Patrick Thomas


I would like to perform a restore out of a shell script.  Normally, I run
restore with a command line like:

restore -x -f /some/dump

Which works _exactly_ as I want it to, except that I am asked two
questions:

Specify next volume #:

and then at the end of the restore:

set owner/mode for '.'? [yn]

So that is a problem, since I want to run it unattended, without requiring
user input.  I have discovered that this command line:

restore -rf /some/dump

will run without user input.  My question is, is the output of this
command identical to the output of the original one I was running ?  I
_do_ indeed wish to specify owner/mode for '.' and have everything restore
just right like it was with my original command line - am I missing
anything or losing any of my original functionality by using this new
command line ?  Or is it identical in result (except for the extra
`restoresymtable` file it produces) to the original command I had ?

thanks,

PT





Re: tunings for many httpds...

2002-06-25 Thread Patrick Thomas


 Incidentally, looking at the PV entry angle for a moment.  Suppose you
 create a 1GB sysvshm (pageable) segment.  That's 262144 pages.  Mapping this
 once means you consume 262144 PV entries.  At 28 bytes each, that is
 about 7.3MB of KVM.  Now, fork this process 300 times.  The numbers become
 78643200 PV entries taking up about 2.2GB, which would have to fit
 in the 1G KVA space.  We don't even nearly have a way to fit all this in.

 This is the killer reason for SHM_PHYS stuff.  It avoids the PV load which
 has to fit into a single confined space.  The cost of the page table pages
 sucks, but at least that is spread over the VM space of 300 processes.
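
The PV entry arithmetic quoted above can be checked with a few lines. This is a worked sketch, assuming the 4K i386 page size and the 28-byte PV entry size stated in the quote; `pv_entry_cost` is a hypothetical helper name.

```python
PAGE_SIZE = 4096       # bytes per page on i386
PV_ENTRY_SIZE = 28     # bytes per PV entry, as stated in the quoted message


def pv_entry_cost(segment_bytes, mappings):
    """Return (number of PV entries, bytes of KVM they occupy)."""
    pages = segment_bytes // PAGE_SIZE
    entries = pages * mappings
    return entries, entries * PV_ENTRY_SIZE


for mappings in (1, 300):
    entries, cost = pv_entry_cost(1 << 30, mappings)
    print(f"{mappings:3d} mapping(s): {entries} PV entries, {cost / 1e6:.1f} MB")
```

One mapping of the 1GB segment costs 262144 entries (about 7.3 MB); 300 mappings cost 78643200 entries (about 2.2 GB), which is more than the 1G KVA space mentioned in the quote.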

Ok, I'm confused now - so I understood you to originally say that SHM does
not eat into KVA regardless of whether I set the kern.ipc.shm_use_phys to
'1' or not.

This leads me to conclude that setting that sysctl to 1 will probably not
be the magic bullet to stop my system from inexplicably halting. (my
system with greatly (4x) increased SHM/SEM/etc. settings)

But now in this post ... are you saying that from the PV entry angle
that KVA _is_ sometimes used for SHM, when we create a pageable segment ?

Or are you just providing a thought experiment and pointing out that if it
_were_ done this way then XYZ bad things would occur ?

thanks,

PT





Re: (jail) problem and a (possible) solution ?

2002-06-24 Thread Patrick Thomas


Terry,

I made an initial change to the kernel of reducing maxusers from 512 to
256 - you said that 3gig is right on the border of needing extra KVA or
not, so I thought maybe this unnecessarily high maxusers might be pushing
me over the top.  However, as long as I was changing the kernel, I also
added DDB.

The bad news is, it crashed again.  The good news is, I dropped to the
debugger and got the wait channel info you wanted with `ps`.  Here are the
last four columns of ps output for the first two pages of processes
(roughly 900 procs were running at the time of the halt, so of course I
can't give you them all, especially since I am copying by hand)

3   select  c0335140  local
3   select  c0335140  trivial-rewrite
3   select  c0335140  cleanup
3   select  c0335140  smtpd
3   select  c0335140  imapd
2                     httpd
2                     httpd
3   sbwait  e5ff6a8c  httpd
3   lockf   c89b7d40  httpd
3   sbwait  e5fc8d0c  httpd
2                     httpd
3   select  c0335140  top
3   accept  e5fc9ef6  httpd
3   select  c0335140  imapd
3   select  c0335140  couriertls
3   select  c0335140  imapd
2                     couriertls
3   ttyin   c74aa630  bash
3   select  c0335140  sshd
3   select  c0335140  tt++
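
With roughly 900 processes, a tally per wait message is easier to read than the raw listing. The sketch below counts the twenty hand-copied entries above; it assumes the four columns are state, wait message, wait channel and command, and that entries with only a state and a command are runnable. `tally_wait_messages` is a hypothetical helper name.

```python
from collections import Counter

PS_LINES = """\
3 select c0335140 local
3 select c0335140 trivial-rewrite
3 select c0335140 cleanup
3 select c0335140 smtpd
3 select c0335140 imapd
2 httpd
2 httpd
3 sbwait e5ff6a8c httpd
3 lockf c89b7d40 httpd
3 sbwait e5fc8d0c httpd
2 httpd
3 select c0335140 top
3 accept e5fc9ef6 httpd
3 select c0335140 imapd
3 select c0335140 couriertls
3 select c0335140 imapd
2 couriertls
3 ttyin c74aa630 bash
3 select c0335140 sshd
3 select c0335140 tt++
"""


def tally_wait_messages(text):
    """Count processes per wait message; two-field lines count as runnable."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        counts[fields[1] if len(fields) == 4 else "(runnable)"] += 1
    return counts


for wmesg, count in tally_wait_messages(PS_LINES).most_common():
    print(f"{wmesg:12s} {count}")
```

In this sample most entries sit in select, which is ordinary idle behaviour for daemons; a process stuck on an unusual wait channel would stand out in such a tally.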


So there it all is.  Does this confirm your feeling that I need to
increase KVA?  Or does it show you that one of the one or two other low
probability problems is occurring?

thanks,

PT


On Sun, 23 Jun 2002, Terry Lambert wrote:

 Patrick Thomas wrote:
  I think I'll just decrease my swap size from 2 gigs to 1 gig - is that a
  reasonable alternative that provides the same benefit and possible
  solution to this problem ?
 
  ...since basically 0 swap has ever been used on the machine anyway...

 Not really.

 The code in machdep.c allocated pmaps for swapped memory based
 on the size of real memory, rather than based on available swap.

 The reason it does this is that you can (effectively) add an
 arbitrary amount of swap later with swapon, without the swap
 devices at the time being known to the kernel at boot.  This
 makes it impossible to prereserve the number of pmap pages that
 will be needed for the actual amount of swap.

 Matt Dillon made some autosizing changes after I complained
 about this before.  My actual complaint was to implicate the
 size of real memory available relative to the size of the full
 address space.  The change he made attempts to autosize, and
 doesn't quite mirror this policy directly.  This code is not
 available in 4.5.  I believe that it was back-ported to 4.6,
 but you would have to look at the CVS log on machdep.c to be
 sure about this -- it may only be in -current.

 The upshot of this is that having a lot of memory reserves
 pmap entries at 4K per 4M of real OR virtual memory.  The
 result of this is that at 4G of physical RAM, you actually
 end up allocating more pmap's than 1G of memory can contain,
 since the total of physical RAM plus swap over 1024 is
 larger than 1G minus the amount taken by an idle kernel, not
 including the page mappings.

 If you have 3G of real RAM (which you do), then you are on
 the borderline of running out.  When you factor in the amount
 of *potential* swap that machdep.c reserves, plus tuning for
 maxfiles/sockets/inpcb/tcpcb/mbufs/etc. (if any), PLUS the
 RAM taken up for things associated with running over 1000
 processes (as your system does), then you end up exhausting
 the amount of VM space available.

 As I said before, though, the only way to know for sure if
 this is your real problem is to break to the debugger after
 the lockup (it's *not* a crash), and check out the wait channels
 for the processes that are unable to run.

 If you want a tweak for 4.5 that has about a 95% probability of
 masking the problem, then you need to up the KVA space.

 Unfortunately, it's not really possible to tell you where
 every byte of memory is going.  Also, unfortunately, the
 pmap's for swappable memory are not themselves swappable
 (or this would not be a problem).  Probably, pmaps for
 swap and for file backing store for executables should be
 allocated when they are needed, not preallocated (they can
 be, if you are not out of RAM, or have RAM, but are out of
 KVA space in which to create mappings) [see growkernel].

 Taking out 1G of physical memory from the box might also
 fix the problem without a kernel tweak, FWIW.

 However, right now, you need to cause the problem, enter
 the debugger, and use ps in the debugger to examine the
 wait channels.

 -- Terry






Re: (jail) problem and a (possible) solution ?

2002-06-24 Thread Patrick Thomas


A few items that deserve mention, and two questions:

a) this problem occurred back when the machine had 2gigs in it - I
actually (naively) added the third gig of physical ram to try to fix the
problem.

b) another machine of mine is now exhibiting the same behavior - it has
far fewer processes running (~500 vs ~1000) and it has only 2gigs of RAM.

questions:

1) How do I give you an entire `ps` output from DDB ?  Is there a way to
output it to a floppy or something ?  Or are you suggesting to copy down
by hand ~1000 lines of ps output ?

2) Any other suggestions as to what it is - if it doesn't look like KVA,
and I reduced my swap from 2gig to 256megs, and I reduced maxusers from
512 to 256 ... basically I have a perfectly healthy machine that crashes
for no reason ?

All of your help is greatly appreciated.  It's just so frustrating to have
it halt every day for no apparent reason - as you saw from the `top`
output just as it halted the other day , the load is trivial.

--PT


On Mon, 24 Jun 2002, Matthew Dillon wrote:

 Well, it should be noted that there are two things going on with swap.
 What I adjusted was the size of the swap_zone, which holds swblocks.
 These structures hold the VM->SWAP block mappings for things that are
 swapped out.  The swap zone eats a lot more KVA than the radix tree
 holding the swap bitmaps.

 The actual swap bitmaps are allocated from the M_SWAP malloc pool.  These
 allocations are based on NSWAP * (largest_single_swap_area).  NSWAP
 is usually 4.

 Having a single 2GB swap area is therefore somewhat expensive, but still
 nowhere near the size required to exhaust KVM (or even come close to
 exhausting KVM).  It is just as expensive as having 4 x 2GB swap areas
 due to the way the bitmaps are allocated.  The swap bitmaps eat around
 2 bits per 4K block of swap so a single 2GB of swap will eat
 2G/4K x 2 / 8 x NSWAP(4) = 0.5 MB of ram.  Not very much.
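
 The bitmap estimate above can be reproduced directly. This is a worked sketch of the formula as quoted (2 bits per 4K block, multiplied by NSWAP); `swap_bitmap_bytes` is a hypothetical helper name, and the real kernel allocation may differ in detail.

```python
def swap_bitmap_bytes(largest_swap_bytes, nswap=4, block_size=4096,
                      bits_per_block=2):
    """Approximate M_SWAP bytes used by the swap bitmaps."""
    blocks = largest_swap_bytes // block_size
    # Bits are converted to bytes, then multiplied by the NSWAP slots.
    return blocks * bits_per_block // 8 * nswap


cost = swap_bitmap_bytes(2 << 30)   # largest single swap area is 2GB
print(f"{cost} bytes, or {cost / 2**20:.1f} MB")
```

 For a 2GB swap area this gives 524288 bytes, the 0.5 MB quoted above.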

 But, getting back to the swblocks... these use a zone, SWAPMETA
 (vmstat -z | less, search for SWAPMETA).  The zone reserves KVA.
 A machine with 2GB of real memory will typically reserve around 10 MB
 of KVA to hold swblocks.  Previously it reserved 20-40 MB of KVA which
 really ate into available KVA.  It should not be a problem now but
 it's very easy for you to check.  Multiply the size (160) against the
 LIMIT and you will get the approximate KVA reservation being used
 for the SWAPMETA zone.

 --

 Ok, history lesson over.  Going over your original posting and the ps
 you just posted from ddb there is not enough information to make
 any sort of diagnosis.  It doesn't look like KVA exhaustion to me,
 and the ps does not show any deadlocks.  I'm not sure what is going
 on.  I think some more experimentation is necessary... e.g. breaking into
 DDB after it deadlocks and doing a full 'ps' (don't leave anything out
 this time), and potentially getting a kernel core dump (assuming you
 compiled the kernel -g and have a kernel.debug lying around that we
 can gdb the core against).

   -Matt
   Matthew Dillon
   [EMAIL PROTECTED]







tunings for many httpds...

2002-06-24 Thread Patrick Thomas


As a splinter to the ongoing KVA/crash/memory discussion, I am wondering:

- given a machine that will run 250+ httpds and another ~800 misc.
processes, what system tunings would any of you suggest other than the
ones I have done:


In my kernel:   maxusers=256 (was 512, change to 256 didn't help)
options SHMMAXPGS=16384
options SHMMAX=(SHMMAXPGS*PAGE_SIZE+1)
options SHMSEG=256
options SEMMNI=384
options SEMMNS=768
options SEMMNU=384
options SEMMAP=384
(all this SHM and SEM stuff is to run multiple postgres')
and at boot time:
sysctl -w jail.sysvipc_allowed=1
sysctl -w kern.ipc.shmall=65535
sysctl -w kern.ipc.shmmax=134217728
sysctl -w net.inet.tcp.syncookies=0


Anything obvious I am missing ?  Terry seems to think:

<quote>
It's obvious that you are running a large number of httpd's; the
sbwait in this case could be reasonably assumed to be waits based
on sendfile for a change in so->so_snd->sb_cc; if that's the
case, then it may be that you are simply running out of mbufs,
and are deadlocking.  This can happen if you have enough data in
the pipe that you can not receive more data (e.g. the m_pullup()
in tcp_input() could fail before other things would fail).
</quote>

Two things about this interested me:

a) watching `top` output anytime of the day, I see several httpd processes
in sbwait - granted I can only see 40 lines of processes or so in `top`,
but usually at least two show sbwait.  Worrisome ?

b) As I showed him, the netstat -m output 30-60 seconds before the crash
looks very benign:

524/2576/34816 mbufs in use (current/peak/max):
500 mbufs allocated to data
24 mbufs allocated to packet headers
273/2254/8704 mbuf clusters in use (current/peak/max)
5152 Kbytes allocated to network (19% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Is it possible that within 30 seconds or so current mbufs would skyrocket
and my percentage of mb_map in use would skyrocket and I would start to
see requests for memory denied ?
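
One rough way to put numbers on how benign this output looks is to turn the current/peak/max triples into fractions of the maximum. The Python sketch below parses only the two "in use" lines copied from the output above; the function name usage and the regular expression are assumptions made for illustration, not part of netstat.

```python
import re

# The two "in use" lines from the netstat -m output above.
SAMPLE = """\
524/2576/34816 mbufs in use (current/peak/max):
273/2254/8704 mbuf clusters in use (current/peak/max)
"""

PATTERN = re.compile(r"^(\d+)/(\d+)/(\d+) (mbufs|mbuf clusters) in use")


def usage(text):
    """Return {resource: {"current": fraction, "peak": fraction}}."""
    result = {}
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            current, peak, maximum = (int(g) for g in match.groups()[:3])
            result[match.group(4)] = {
                "current": current / maximum,
                "peak": peak / maximum,
            }
    return result


for name, fractions in usage(SAMPLE).items():
    print(f"{name}: current {fractions['current']:.1%}, "
          f"peak {fractions['peak']:.1%}")
```

Run once per minute alongside the raw output, this would show whether the peak fraction is creeping toward the maximum before a lockup, though it cannot see a spike that happens between samples.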

All comments appreciated.

--PT





Re: (jail) problem and a (possible) solution ?

2002-06-24 Thread Patrick Thomas


 It's obvious that you are running a large number of httpd's; the

Yes, we are running a lot of httpd's:

ps auxw | grep httpd | wc -l = 288

 The way to cross-check this would be to run a continuous netstat -m,
 e.g.:

Funny you should ask :)  I was already doing that.  Here is the output
from a `netstat -m` run once per minute - the machine crashed sometime in
the next 30-60 seconds after I got this output:

524/2576/34816 mbufs in use (current/peak/max):
500 mbufs allocated to data
24 mbufs allocated to packet headers
273/2254/8704 mbuf clusters in use (current/peak/max)
5152 Kbytes allocated to network (19% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines


 Basically, if you have any denials, or if the number of mbuf
 clusters gets really large, then you could have a problem.

Do you think it is reasonable that the above netstat -m output could,
within 30 or so seconds, ramp up to the bad situation you are describing ?
Because it looks fairly benign to me...


I have three questions:

1. Forgetting about my particular problem for a moment, let's say you have
to tune a machine to run 200+ httpd servers along with another 800 misc.
processes, etc.  What do you suggest setting, just to be safe (again, as a
precaution - forgetting that in reality I am trying to fix a sick machine) ?
So far I have only tuned:

In my kernel:   maxusers=256 (was 512, change to 256 didn't help)
options SHMMAXPGS=16384
options SHMMAX=(SHMMAXPGS*PAGE_SIZE+1)
options SHMSEG=256
options SEMMNI=384
options SEMMNS=768
options SEMMNU=384
options SEMMAP=384
(all this SHM and SEM stuff is to run multiple postgres')

and at boot time:
sysctl -w jail.sysvipc_allowed=1
sysctl -w kern.ipc.shmall=65535
sysctl -w kern.ipc.shmmax=134217728
sysctl -w net.inet.tcp.syncookies=0

So anything obvious I am missing that you would tune for a 200+ http + 800
other processes machine?



2. Let's say I was being targeted by that effective attack you spoke
of...any way to immunize myself ?


3. You spoke of:

   # sysctl -a | grep tcp | grep space
   net.inet.tcp.sendspace: 32768
   net.inet.tcp.recvspace: 65536

 I guess the best way to deal with this would be to drop the size
 of the send or receive queues, until it didn't consume all your
 memory.  In general, the size of these queues is supposed to be
 a *maximum*, not a *mean*, so the number of sockets possible,
 times the maximum total of both, will often exceed the amount of
 available mbuf space.

a) are you saying to collect these sysctls regularly and
try to see their values right at the crash ?

b) where do I drop the size of the send or receive queues ?
(sysctl or kernel setting?)
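
To make the "maximum, not a mean" point concrete, here is a worst-case calculation in Python. The sendspace and recvspace values are the ones quoted above, and 8704 is the mbuf cluster maximum from the netstat -m output earlier in this message. The 2048-byte cluster size is an assumption, and the sketch ignores plain mbufs and per-packet overhead, so treat it as a rough bound rather than a prediction.

```python
# Worst case: every socket fills both its send and receive queues.
SENDSPACE = 32768       # net.inet.tcp.sendspace, quoted above
RECVSPACE = 65536       # net.inet.tcp.recvspace, quoted above
MCLBYTES = 2048         # assumed size of one mbuf cluster in bytes
MAX_CLUSTERS = 8704     # cluster "max" from the netstat -m output

per_socket_bytes = SENDSPACE + RECVSPACE          # both queues full
cluster_pool_bytes = MAX_CLUSTERS * MCLBYTES      # total cluster space
sockets_to_exhaust = cluster_pool_bytes // per_socket_bytes

print(per_socket_bytes)     # 98304
print(cluster_pool_bytes)   # 17825792
print(sockets_to_exhaust)   # 181
```

Under these assumptions, fewer than 200 sockets with completely full queues would account for the whole cluster pool, which is why the queue sizes times the number of possible sockets can exceed the available mbuf space.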


thank you very much.  I will try to get a full `ps` tonight when it
crashes again :(

--PT








 An interesting attack that is moderately effective on FreeBSD
 boxes is to send with a very large size, and not send one of
 the fragments (e.g. the second one) to prevent fragment
 reassembly, and therefore saturate the reassembly queue.  The
 Linux UDP NFS client code does this unintentionally, but you
 could believe that someone might be doing it intentionally,
 as well, which would also work against TCP.  It's doubtful that
 you are being hit by a FreeBSD targeted attack, however.

 -- Terry






Re: (jail) problem and a (possible) solution ?

2002-06-23 Thread Patrick Thomas


ok.  I was just looking back at a previous comment you made:

 Amusingly enough, you might actually have *better* luck with a
 lot less swap...

and thinking that even if removing most of the swap did not _solve/mask_
the problem, at least it would be a step in the same direction as upping
KVA (even if it is not as large a step)  but if that is not the case...

...then, has anyone written a HOWTO on upping it in 4.5-RELEASE ?  You
mentioned looking back over your own old posts on the subject - before I
jump in and try it, I want to confirm what I believe I understand: I need
to set the KVA value in my kernel config _and_ edit those other two files
in the kernel source, then just recompile my kernel.

Sound like I'm on the right track ?

Terry, thanks again for your help and for all the help you regularly give
to other people pursuing items such as this on the various FreeBSD lists.

--PT




On Sun, 23 Jun 2002, Terry Lambert wrote:

 Patrick Thomas wrote:
  I think I'll just decrease my swap size from 2 gigs to 1 gig - is that a
  reasonable alternative that provides the same benefit and possible
  solution to this problem ?
 
  ...since basically 0 swap has ever been used on the machine anyway...

 Not really.

 The code in machdep.c allocates pmaps for swapped memory based
 on the size of real memory, rather than based on available swap.

 The reason it does this is that you can (effectively) add an
 arbitrary amount of swap later with swapon, without the swap
 devices at the time being known to the kernel at boot.  This
 makes it impossible to prereserve the number of pmap pages that
 will be needed for the actual amount of swap.

 Matt Dillon made some autosizing changes after I complained
 about this before.  My actual complaint was to implicate the
 size of real memory available relative to the size of the full
 address space.  The change he made attempts to autosize, and
 doesn't quite mirror this policy directly.  This code is not
 available in 4.5.  I believe that it was back-ported to 4.6,
 but you would have to look at the CVS log on machdep.c to be
 sure about this -- it may only be in -current.

 The upshot of this is that having a lot of memory reserves
 pmap entries at 4K per 4M of real OR virtual memory.  The
 result of this is that at 4G of physical RAM, you actually
 end up allocating more pmap's than 1G of memory can contain,
 since the total of physical RAM plus swap over 1024 is
 larger than 1G minus the amount taken by an idle kernel, not
 including the page mappings.

 If you have 3G of real RAM (which you do), then you are on
 the borderline of running out.  When you factor in the amount
 of *potential* swap that machdep.c reserves, plus tuning for
 maxfiles/sockets/inpcb/tcpcb/mbufs/etc. (if any), PLUS the
 RAM taken up for things associated with running over 1000
 processes (as your system does), then you end up exhausting
 the amount of VM space available.

 As I said before, though, the only way to know for sure if
 this is your real problem is to break to the debugger after
 the lockup (it's *not* a crash), and check out the wait channels
 for the processes that are unable to run.

 If you want a tweak for 4.5 that has about a 95% probability of
 masking the problem, then you need to up the KVA space.

 Unfortunately, it's not really possible to tell you where
 every byte of memory is going.  Also, unfortunately, the
 pmap's for swappable memory are not themselves swappable
 (or this would not be a problem).  Probably, pmaps for
 swap and for file backing store for executables should be
 allocated when they are needed, not preallocated (they can
 be, if you are not out of RAM, or have RAM, but are out of
 KVA space in which to create mappings) [see growkernel].

 Taking out 1G of physical memory from the box might also
 fix the problem without a kernel tweak, FWIW.

 However, right now, you need to cause the problem, enter
 the debugger, and use ps in the debugger to examine the
 wait channels.

 -- Terry






Re: inuring FreeBSD to the apache bug without upgrading apache ?

2002-06-23 Thread Patrick Thomas



 Yeah; this whole thread is premised on working around the
 problem without an Apache software change.  It's a reasonable
 premise (IMO) -- if you've got a custom compilation and a lot
 of modules, that can end up being a lot of software.  I build
 a PHP4+SSL+Apache+IMAP+etc. source tree at one point, and it
 ended up being ~1.2 million lines of code, all told, that had
 to be made to work together.  If you had just built it, then
 it would be very hard to update just one component without
 repeating the whole process.  My advice?  Use CVS.

Actually, this whole thread is premised on "I have a dev system with 16
jailed apaches and it would be a pain to upgrade all 16 of them vs. just
making one global kernel/environment change".  It sounds like that is
probably a pipe dream though...

--PT





Re: (jail) problem and a (possible) solution ?

2002-06-23 Thread Patrick Thomas



  jump in and try it, I want to confirm what I believe to understand, I need
  to set the KVA value in my kernel config _and_ edit those other two files
  in the kernel source, then just recompile my kernel.
 
  Sound like I'm on the right track ?

 Yes.  That's the way to do it for 4.5, specifically.

Because I am paranoid, I like to check the state of a measurement before
making a change and then after, to see that what I did did indeed induce a
change ... I have this irrational fear that sometimes I make changes like
this and nothing in fact changed, and I just don't know it :)

So, should I just look for the value of:

vm.zone_kmem_kvaspace: 179691520

to increase in size even though the physical RAM stays the same at 3gigs,
or is there some other measurement I should look at before and after the
KVA increase to ensure that it worked (and yes, I know that if it doesn't
work I probably will have an inoperable machine, but just out of
curiosity...)
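
The before/after check itself is easy to script. The Python sketch below only compares two saved `sysctl` output lines. Whether vm.zone_kmem_kvaspace is the value that actually moves when KVA is raised is exactly the open question here, and the "after" figure is hypothetical; the function names are made up for illustration.

```python
def parse_sysctl(line):
    """Split a 'name: value' line from sysctl output into (name, int)."""
    name, _, value = line.partition(":")
    return name.strip(), int(value.strip())


def delta(before_line, after_line):
    """Return after - before for the same sysctl, or raise if names differ."""
    name_before, value_before = parse_sysctl(before_line)
    name_after, value_after = parse_sysctl(after_line)
    if name_before != name_after:
        raise ValueError(f"different sysctls: {name_before} vs {name_after}")
    return value_after - value_before


before = "vm.zone_kmem_kvaspace: 179691520"   # value quoted above
after = "vm.zone_kmem_kvaspace: 200000000"    # hypothetical post-change value
print(delta(before, after))  # 20308480
```

A delta of zero after the rebuild would be the "nothing in fact changed" case.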

thanks,

PT





Re: (jail) problem and a (possible) solution ?

2002-06-22 Thread Patrick Thomas


What it does is the userland hangs, but the kernel keeps running.

When the system is crashed, I can still ping it successfully, and I can
still open sockets (like I can open a connection to a jail's httpd or sshd,
or the sshd of the underlying server itself) but nothing answers on the
sockets - they just hang open.

So everything stops running, but it is still up - still responds to
pings...syslog stops logging though, cron stops running

Two questions for you:

1) do you allow them write access to their /dev/mem, /dev/kmem, /dev/io ?

2) does this sound like what you see?  Can you still ping the crashed
server ?

I'm mostly just curious if this kind of crash (userland hung but kernel
running) is a possible outcome of someone in a jail fiddling with those
/dev nodes, or if fiddling with /dev/mem or /dev/kmem or /dev/io would just lock
the machine up hard and completely.

Terry?

--PT



On Fri, 21 Jun 2002, Nielsen wrote:

 Yes I've had the same problem. One system runs just fine with its jails,
 and another crashes habitually. It has to do with a certain jail (and
 services). Our systems are set up to be able to move jails between them
 (great for backups and near perfect uptime), and a certain set of jails
 always hangs the system in this way. I'm trying to narrow it down. Do you
 get a core dump or does it just hang?

 Nate

 - Original Message -
 From: Patrick Thomas [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Sent: Friday, June 21, 2002 16:43
 Subject: (jail) problem and a (possible) solution ?


 
  A test server of mine running a number of jails keeps locking up - but the
  odd thing about the lockup is that the userland stops, but the kernel
  keeps running
 
  (sockets can be opened, but the servers never respond on them, the machine
  still responds to pings, but logs show that all real activity stops)
 
  I just noticed today that some jails still have writable /dev/mem and
  /dev/kmem and /dev/io nodes.  I think it is plausible that some kind of
  fiddling (writing) to these nodes is causing this kind of lockup.
 
  
 
  Is this assumption reasonable, or if some jail user fiddled with their
  /dev/mem or /dev/kmem or /dev/io node would it just totally crash out the
  machine and I _wouldn't_ still be able to ping the server after it crashes
  ?
 
  thanks,
 
  PT
 
 
 







Re: (jail) problem and a (possible) solution ?

2002-06-22 Thread Patrick Thomas


Terry,

Thanks for that informative email - just a quick reality check though (for
myself) - the last time this type of crash happened, I was running and
watching `top` on the machine - and when it froze, the `top` output froze
as well, and this was the last display on the screen:


last pid:  6603;  load averages:  3.81,  1.84,  1.48
1032 processes: 1 running, 1026 sleeping, 5 zombie
CPU states:  1.8% user,  0.8% nice,  3.2% system,  0.1% interrupt, 94.1% idle
Mem: 1129M Active, 1404M Inact, 351M Wired, 103M Cache, 199M Buf, 28M Free
Swap: 2018M Total, 2732K Used, 2015M Free



Since all of the things you spoke of basically revolved around you're
running out of memory, is it possible or reasonable to think that within
the space of 1 second, I ran through 1404 megs inactive and 28 megs free
memory ?

machine is 4.5-RELEASE with 3gigs ram.  swap never gets touched, although
there is in fact 2gigs of swap.  `pstat -s` always shows 0% used.

I'll do the debug actions you suggested.

--PT






Re: (jail) problem and a (possible) solution ?

2002-06-22 Thread Patrick Thomas


How do you increase KVA space these days ?  I see that in earlier releases
you had to edit /sys/conf/ldscript.i386 and /sys/i386/include/pmap.h and
do all sorts of crazy stuff.

What is the procedure in 4.5-RELEASE (please say just change
KVA_PAGES=260 to KVA_PAGES=512)

That's what you want me to do, right ?  Is that all - can it be done just
by changing that one value in my kernel config ?

Again, thank you Terry for all your help.

--PT


On Sat, 22 Jun 2002, Terry Lambert wrote:

 Patrick Thomas wrote:
  Since all of the things you spoke of basically revolved around you're
  running out of memory, is it possible or reasonable to think that within
  the space of 1 second, I ran through 1404 megs inactive and 28 megs free
  memory ?
 
  machine is 4.5-RELEASE with 3gigs ram.  swap never gets touched, although
  there is in fact 2gigs of swap.  `pstat -s` always shows 0% used.

 OK, there's memory, and then there's memory.

 The amount of swap you have, the fact that it's 4.5, and the
 amount of RAM you have imply to me that the problem is that
 you are out of pmap entries.

 You should up your KVA space to 2G or maybe even 3G; the default
 in 4.5 was 1G.

 Basically, I now think that you don't have enough memory to map
 how much memory and virtual memory you have.

 Amusingly enough, you might actually have *better* luck with a
 lot less swap...

 If your KVA space is already enlarged above the default, then
 you can ignore this and just go ahead with the debugging to see
 what the wait channels for all the processes that won't run are
 stuck at.

 -- Terry






Re: (jail) problem and a (possible) solution ?

2002-06-22 Thread Patrick Thomas


I think I'll just decrease my swap size from 2 gigs to 1 gig - is that a
reasonable alternative that provides the same benefit and possible
solution to this problem ?

...since basically 0 swap has ever been used on the machine anyway...

--PT

On Sat, 22 Jun 2002, Terry Lambert wrote:

 Patrick Thomas wrote:
  How do you increase KVA space these days ?  I see that in earlier releases
  you had to edit /sys/conf/ldscript.i386 and /sys/i386/include/pmap.h and
  do all sorts of crazy stuff.
 
  What is the procedure in 4.5-RELEASE (please say just change
  KVA_PAGES=260 to KVA_PAGES=512)
 
  That's what you want me to do, right ?  Is that all - can it be done just
  by changing that one value in my kernel config ?

 It's what I want you to do.

 For 4.5, you have to hack ldscript.i386 and pmap.h.  I've posted
 on how to do this before (should be in the archives).

 The pages are all going to be off-by-one from your calculations,
 for the recursive page mapping, or off-by-two if your kernel is an
 SMP kernel, for the per CPU page, so remember that, or you will
 end up with a kernel that simply doesn't boot.

 The easiest way is to look at the numbers in pmap.h, and figure
 out how they relate to 0xc0000000 (remember to OR in 0x00100000
 after your math, to count the kernel loading at 1M).
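
The address math described above can be sketched in Python. This assumes a 32-bit (4G) address space with the kernel's KVA at the top, so that the default 1G of KVA corresponds to a kernel base of 0xc0000000, and it ORs in the 1M load offset mentioned above. It deliberately ignores the off-by-one and off-by-two page adjustments for the recursive mapping and the per-CPU page, so it is a cross-check on the round numbers, not a recipe for editing ldscript.i386 or pmap.h. The function names are made up for illustration.

```python
# Kernel base and load address for a given KVA size on a 32-bit machine.
ADDRESS_SPACE = 1 << 32          # 4G of virtual address space
KERNEL_LOAD_OFFSET = 0x00100000  # kernel loads at 1M
GIB = 1 << 30


def kernel_base(kva_bytes):
    """Lowest kernel virtual address if KVA occupies the top of the space."""
    return ADDRESS_SPACE - kva_bytes


def load_address(kva_bytes):
    """Kernel base with the 1M load offset ORed in."""
    return kernel_base(kva_bytes) | KERNEL_LOAD_OFFSET


for gigabytes in (1, 2):
    kva = gigabytes * GIB
    print(f"{gigabytes}G KVA: base {kernel_base(kva):#010x}, "
          f"load {load_address(kva):#010x}")
```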

 -- Terry






Re: inuring FreeBSD to the apache bug without upgrading apache ?

2002-06-21 Thread Patrick Thomas


What none of you has mentioned is the thought I had in mind when I asked
this question, and that is, I have a R&D machine with 16 jails on it, each
running apache.

Therefore in a situation like this it would be _much_ easier to just tune
a sysctl or rebuild the kernel, vs. rebuilding 16 differently configured,
different versions of apache.  YMMV.

--PT

On Fri, 21 Jun 2002, Frank Mayhar wrote:

 Brandon D. Valentine wrote:
  However, I would ask Frank if there's a particular reason he needs to
  use Covalent Raven SSL.  OpenSSL is free, works like gangbusters, and
  comes with FreeBSD.  I have a feeling he'd be much happier with it if
  there's not some other reason he cannot move to it.

 As I mentioned, the two reasons are (1) it hasn't been broken (at least
 up to now) and (2) I haven't had time.  These are colocated production
 boxes; I don't have easy physical access to them to fix things if they
 go seriously wrong, and having them be down for any length of time is a
 Bad Thing.
 --
 Frank Mayhar [EMAIL PROTECTED]   http://www.exit.com/
 Exit Consulting http://www.gpsclock.com/






inuring FreeBSD to the apache bug without upgrading apache ?

2002-06-20 Thread Patrick Thomas


Is it possible to patch/recompile FreeBSD 4.5 in such a way that your
system is no longer vulnerable to the chunking attack, even if you are
still running a vulnerable apache ?

I ask because I see in one of the chunking exploits that:

* Remote OpenBSD/Apache exploit for the chunking vulnerability. Kudos to
 * the OpenBSD developers (Theo, DugSong, jnathan, *@#!w00w00, ...) and
 * their crappy memcpy implementation that makes this 32-bit impossibility
 * very easy to accomplish.

Which leads me to believe there are structures in the OS which help this
vulnerability to exist.  I am _very_ interested to find out if it is
possible to patch this bug at the FreeBSD OS level and not the apache
level.

thanks,

PT







reboot your own jail ?

2002-05-16 Thread Patrick Thomas


currently I reboot jails with this process:

1. someone logs into the jail and runs `kill -KILL -1`
2. someone logs onto the BASE machine and starts it up again.

I wish I could do this without involving the admin of the base machine.

Has anyone come up with a strategy for allowing the root jail user to
successfully reboot their own jail without outside help ?

I can think of some horrible hacks involving constantly checking if the
jail is running, and if it ever stops (presumably someone rebooted it)
then start it again...hopefully there is something more elegant than
that.
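
For reference, the "horrible hack" amounts to a polling loop like the Python sketch below, run on the base machine. The callables is_running and start are hypothetical placeholders: how to detect that a jail has stopped and how to start it again depend on the local setup and are not specified here.

```python
import time


def watch_jail(is_running, start, interval=5.0, max_checks=None,
               sleep=time.sleep):
    """Poll is_running(); call start() whenever the jail is found stopped.

    Returns the number of restarts. max_checks=None polls forever.
    """
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if not is_running():
            start()              # jail went away, presumably rebooted
            restarts += 1
        checks += 1
        sleep(interval)
    return restarts
```

The obvious drawbacks are the ones hinted at above: the jail stays down for up to one polling interval, and the loop cannot tell a deliberate shutdown from a reboot request.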

--pt





Re: reboot your own jail ?

2002-05-16 Thread Patrick Thomas


why -TERM ?  the jail man page recommends -KILL ... just curious...

On Thu, 16 May 2002, Marc G. Fournier wrote:


 web interface that is password protected that does:

   ssh root@jail kill -TERM -1
   restart jail




 On Thu, 16 May 2002, Patrick Thomas wrote:

 
  currently I reboot jails with this process:
 
  1. someone logs into the jail and runs `kill -KILL -1`
  2. someone logs onto the BASE machine and starts it up again.
 
  I wish I could do this without involving the admin of the base machine.
 
  Has anyone come up with a strategy for allowing the root jail user to
  successfully reboot their own jail without outside help ?
 
  I can think of some horrible hacks involving constantly checking if the
  jail is running, and if it ever stops (presumably someone rebooted it)
  then start it again...hopefully there is something more elegant than
  that.
 
  --pt
 
 
 







syncookies exploit behavior

2002-05-07 Thread Patrick Thomas



Two questions regarding the syncookies issue -

1. What kind of crash is it ?  I have an issue where my machine has no
response at the console, and none of the services work (pop, imap, etc.)
HOWEVER you can still ping it, and you can still initiate connections to
services - they just don't talk or respond at all - and cron jobs no longer
run.  Someone suggested that it looks like my userland is frozen, but my
kernel is still running.

Is that the kind of crash you get when you encounter the syncookies
problem ?


2. Is there any way to scour tcpdump on the _affected_ machine to see if
syncookies was indeed your problem ?  This is sort of two questions -
first, will the machine be crashed so fast it won't have time to write
tcpdump output to a file for the packet that caused the crash ?  and
second, if it is possible, what would that tcpdump output look like ?


I suspect you can't scour tcpdump for it, since this problem can be caused
by legitimate traffic.

comments appreciated,

PT





Re: what causes a userland to stop, but allows kernel to continue?

2002-05-06 Thread Patrick Thomas



 No denied requests.  It's not mbufs.  It must be something else.


How do you feel about this:


# vmstat -z

ITEM         SIZE     LIMIT     USED     FREE     REQUESTS

PIPE:         160,        0,     702,     522,      236316
SWAPMETA:     160,   509724,     452,     136,        1125
unpcb:         64,        0,     542,      98,     3398824
ripcb:        192,    16424,       0,      42,           3
syncache:     160,    15359,       0,      51,       49824
tcpcb:        544,    16424,     353,     957,       64527
udpcb:        192,    16424,      83,      45,      150821
socket:       192,    16424,     979,     813,     3614256
KNOTE:         64,        0,       1,     127,       51798
DIRHASH:     1024,        0,    1740,     268,       36897
NFSNODE:      352,        0,       0,       0,           0
NFSMOUNT:     544,        0,       0,       0,           0
VNODE:        192,        0,  124417,      27,      124417
NAMEI:       1024,        0,       0,      24,   151244479
VMSPACE:      192,        0,     875,     533,     3797606
PROC:         416,        0,     881,     540,     3797656
DP fakepg:     64,        0,       0,       0,           0
PV ENTRY:      28,  2690954,  601601,  266301,  2806153478
MAP ENTRY:     48,        0,   34223,    4070,   246626232
KMAP ENTRY:    48,   128821,    3795,     514,      369055
MAP:          108,        0,       7,       3,           7
VM OBJECT:     96,        0,  132173,   10127,    97570617
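
One way to scan a table like this is to multiply SIZE by USED per zone and sort, which shows where the zone memory is actually sitting. The Python sketch below does that for four rows copied from the output above. The parser assumes the "NAME: size, limit, used, free, requests" layout and is an illustration, not a general vmstat parser.

```python
# Four rows copied from the vmstat -z output above.
SAMPLE = """\
SWAPMETA:     160,   509724,     452,     136,        1125
VNODE:        192,        0,  124417,      27,      124417
PV ENTRY:      28,  2690954,  601601,  266301,  2806153478
VM OBJECT:     96,        0,  132173,   10127,    97570617
"""


def parse_zones(text):
    """Return zone dicts sorted by bytes in use (size * used), largest first."""
    zones = []
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue                      # header or blank line
        fields = [field.strip() for field in rest.split(",")]
        if len(fields) != 5 or not all(f.isdigit() for f in fields):
            continue                      # not a data row
        size, limit, used, free, requests = map(int, fields)
        zones.append({"name": name.strip(), "size": size, "limit": limit,
                      "used": used, "free": free, "requests": requests,
                      "used_bytes": size * used})
    return sorted(zones, key=lambda z: z["used_bytes"], reverse=True)


for zone in parse_zones(SAMPLE):
    print(f"{zone['name']:<10} {zone['used_bytes']:>10} bytes in use")
```

For these four rows, VNODE and PV ENTRY come out on top, and SWAPMETA is tiny in terms of entries actually used.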


anything interesting ?

thanks.





what would cause a server to behave this way ?

2002-05-05 Thread Patrick Thomas


We have a FreeBSD 4.5-RELEASE server, it is a SMP system, and four days
ago the following happened:

- console became unresponsive - caps lock key no longer toggled the caps
lock button

- you _could_ still ping the server

- you could still establish connections to running services, but NONE of
those services would actually talk to you.  They would just establish
connection and then sit there.

Here is an example, trying to ssh into the machine:

# ssh -v [EMAIL PROTECTED]
SSH Version OpenSSH-2.1, protocol versions 1.5/2.0.
Compiled with SSL (0x0090581f).
debug: Reading configuration data /etc/ssh/ssh_config
debug: ssh_connect: getuid 0 geteuid 0 anon 0
debug: Connecting to example.com [1.2.3.4] port 22.
debug: Allocated local port 890.
debug: Connection established.


and that is as far as it would go  -  it just sat there forever.  The same is
true with telnetting to port 25 or port 110 or 53 - you would establish a
connection, but you would get no response or output from the server.

We eventually just had to power cycle.

---

So anyway, we are confused - we could still ping it, we could see that
processes (sshd server, mail server, etc.) were still running, and it even
looks like cron jobs continued to run - however, from the console it
looked like a classic hard lock (no caps light LED toggle).

This is a fairly heavily loaded system - in `top` idle CPU usually hovers
around 60%.  But we have never had any trouble in the past...

any comments/suggestions appreciated.

--PT





what causes a userland to stop, but allows kernel to continue ?

2002-05-05 Thread Patrick Thomas


So, based on a previous thread, it looks like I have a server whose
userland halted, essentially, but the kernel continued running.

As evidenced by:

- you can still ping the server just fine
- you can still connect to running services just fine - if you ssh to it,
`ssh -v` (verbose) claims a connection is established, but the server
doesn't respond in any way over that connection.  Further, you can telnet
to POP or IMAP or HTTP ports, and get a connection, but you can't get any
response.
- cron does NOT run while the server is in this state - no jobs run
- no response from the console - caps lock does NOT toggle the LED

So, as was suggested in the previous thread, it looks like my kernel is
still running, but the userland has halted.  There are no log entries that
give any clue as to why this happened last week.


1. from a theoretical standpoint, how would this happen ?
2. Is there any way to watchdog for it and escape from it before the
userland completely crashes ?
3. any previous/old problems that would cause this behavior ?


It is a FreeBSD 4.5-RELEASE system, and it is SMP - fairly heavily loaded
(averages 60% CPU idle in `top` output).

thanks,

PT






Re: what causes a userland to stop, but allows kernel to continue?

2002-05-05 Thread Patrick Thomas


Are NMBCLUSTERS and mbuf determined by 'maxusers' ?

I have maxusers=512 ... comments ?

When you suggest 'clamp the total number of sockets that are permitted to
be open' ... how is this done - is there a sysctl that corresponds to
total number of sockets that are permitted to be open ?

I am also a little confused how this performance issue is solved by
_lowering_ a tunable value - all of my problems up to this point (ran out
of file descriptors, ran out of ptys, etc.) were solved by increasing
them.

Thank you for your help,

PT


On Sun, 5 May 2002, Terry Lambert wrote:

 Anthony Schneider wrote:
  Livelock, maybe?  Is there some sort of internal kernel semaphore table which
  might be getting filled up or something?  I'd also like to find out more about
  this, but sadly, the machine is a remote one and I can't drop into ddb as
  suggested...
  Thanks you all very much.  Hope this information is of use.
  -Anthony.

 More likely, you have run out of some non-renewable resource,
 such as mbufs, and are in the midst of a deadly embrace deadlock
 (e.g. as a result of having no mbufs to send responses or receive
 acknowledgements which would free up mbufs currently held for TCP
 sessions in progress, etc.).

 The easiest way to see this is to periodically record vmstat -m
 and netstat -m output to a disk file, and sync, in order to make
 sure that it's recorded at the time you must reset.

 Then plot the information over time, up to the point of the failure,
 and you will likely see the problem in gory detail.
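
A minimal Python sketch of that recording step follows. It appends the output of each command to a log file and forces it to disk, so the last sample survives a reset. The function name record_snapshot and the log path in the usage comment are assumptions for illustration; the commands themselves are the ones named above.

```python
import os
import subprocess
import time


def record_snapshot(commands, path):
    """Append a timestamped dump of each command's output to path, then fsync."""
    with open(path, "a") as log:
        log.write(f"=== {time.strftime('%Y-%m-%d %H:%M:%S')} ===\n")
        for cmd in commands:
            try:
                output = subprocess.run(cmd, capture_output=True,
                                        text=True).stdout
            except FileNotFoundError:
                output = "(command not found)\n"
            log.write(f"$ {' '.join(cmd)}\n{output}")
        log.flush()
        os.fsync(log.fileno())   # make sure the sample is on disk


# Example use (hypothetical log path), once per minute:
#   while True:
#       record_snapshot([["vmstat", "-m"], ["netstat", "-m"]],
#                       "/var/log/memwatch.log")
#       time.sleep(60)
```

The fsync call flushes only this file rather than running a system-wide sync, which keeps the overhead of frequent sampling small.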

 If it is something like mbuf starvation, then you should clamp the
 total number of sockets that are permitted to be open at half the
 maximum window size divided into the number of mbufs available,
 minus 10% for a reserve.


 In general, the tuning page is broken; a number of the things it
 suggests tuning via sysctl at run time are not actually tunable at
 run time, only at boot time.  Though at run time, they will remove
 the top end limits, they will in fact not result in the reservation
 of sufficient resource to meet those limits, as they would had they
 been in effect at boot time, instead.

 In particular, increasing the number of open files permitted by
 modifying maxfiles via sysctl at runtime will not add to the
 prereserved amount of tcpcb's, inpcb's, or socket structures,
 all of which could leave you starving for one of these objects,
 or the mbuf's needed to support them, at runtime.

 It pays to understand the code before fiddling the numbers.  ;^).

 -- Terry






RFC on my SHM tunings for multiple jailed postgres...

2002-05-03 Thread Patrick Thomas


I have a large server that will be running ~24 jails, 8 of which will be
running their own postgres server.

Because of this fact:

By default, Postgres allocates 34 semaphores, which is over half the
default system total of 60.

I need to tune kernel SHM settings in order to even run the second
postgres, much less the other six.  So, this is what I have in my kernel,
and I appreciate any comments or suggestions regarding the
appropriateness:


(of course, by default, I have)

options SYSVSHM #SYSV-style shared memory
options SYSVMSG #SYSV-style message queues

(and the following is _all_ that I have added)

options SHMMAXPGS=16384
options SHMMAX=(SHMMAXPGS*PAGE_SIZE+1)
options SHMSEG=256
options SEMMNI=384
options SEMMNS=768
options SEMMNU=384
options SEMMAP=384
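
A quick sanity check of the arithmetic behind these numbers, as a small Python sketch. The 4096-byte PAGE_SIZE is an assumption (the usual i386 value); the other figures are taken from this message.

```python
POSTGRES_INSTANCES = 8        # from the message
SEMS_PER_POSTGRES = 34        # default allocation quoted in the message
SEMMNS = 768                  # options SEMMNS above
SHMMAXPGS = 16384             # options SHMMAXPGS above
PAGE_SIZE = 4096              # assumption: i386 page size

# System-wide semaphores the eight postgres servers would take together.
sems_needed = POSTGRES_INSTANCES * SEMS_PER_POSTGRES
# Same expression as the SHMMAX option line, evaluated in bytes.
shmmax_bytes = SHMMAXPGS * PAGE_SIZE + 1

print("semaphores needed:", sems_needed, "of", SEMMNS)
print("semaphores left for everything else:", SEMMNS - sems_needed)
print("SHMMAX in bytes:", shmmax_bytes)
```

This only checks the system-wide semaphore total and the shared memory ceiling. It says nothing about whether SEMMNI, SEMMNU, SEMMAP or SHMSEG are adequate.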


My references for this are:

http://www.us.postgresql.org/users-lounge/docs/7.1/admin/kernel-resources.html

http://groups.google.com/groups?q=freebsd+SEMMNI+postgres&hl=en&selm=01091023443406.73075%40prime.vsservices.com&rnum=7


All comments and suggestions appreciated !

--PT





cryptography implications (privacy) of FreeBSD jail ?

2002-03-11 Thread Patrick Thomas


Let's say I am running in a jail, and say 5 other people are running in
other, separate jails on the same machine.

Now let's say I start up pgp, and generate my keys, and generally use pgp
through the command line in my jail.  Or, instead of pgp I do other crypto
related sensitive activities...

What is my risk here ?  Can someone either on the host machine or in one
of the other jails watch memory on the machine and discern things like my
keys or passphrases or have very easy access to the data I am decrypting ?

Please feel free to expand on the topic as well, in case there are related
questions that I am _not_ asking, but should be...

--pt





Re: cannot get more than 32 PTYs in 4.4-RELEASE

2002-03-05 Thread Patrick Thomas


Ok, see the point is, I have _already done this_

 sh MAKEDEV pty0   # 0-31
 sh MAKEDEV pty1   # 32-63
 sh MAKEDEV pty2   # 64-95
 sh MAKEDEV pty3   # 96-127
 sh MAKEDEV pty4   # 128-159 xterm won't recognize by default
 sh MAKEDEV pty5   # 160-191 xterm won't recognize by default
 sh MAKEDEV pty6   # 192-223 xterm won't recognize by default
 sh MAKEDEV pty7   # 224-255 xterm won't recognize by default


These are the exact commands I used with `sh MAKEDEV` to create my 256 pty
/dev entries.

So to recap, all 256 /dev files are there, all 256 entries are in
/etc/ttys (and were there by default) and I have:

maxusers        128

and

pseudo-device   pty 128

in my kernel.  And when I create 32 screens with `screen`, nobody else can
login by any method (ssh, telnet, etc.).  (No more PTYs error, etc.)

What am I missing here ?  Please note that this is 4.4-RELEASE - this
doesn't seem to be a problem in 4.5

thanks,

PT





Re: Four misc. questions related to jail usage

2002-03-05 Thread Patrick Thomas


 Patrick Thomas [EMAIL PROTECTED] writes:
  1. Does each jail need to have its own proc filesystem mounted?

 No, procfs is pretty much useless these days (except for truss).

In 4.5, won't `ps` (and perhaps other apps) not work for people in a jail
if their jail does not have a proc file system mounted in their /proc ?

--pt





cannot get more than 32 PTYs in 4.4-RELEASE

2002-03-04 Thread Patrick Thomas


In my kernel, I have:

maxusers        128

pseudo-device   pty 128

In my /dev directory, I have used `sh MAKEDEV` to make all 256 /dev/pty
files.  They are all there, and all have correct major/minor numbers.  I
know I won't be using all 256 of them, but I just made them all anyway.

In /etc/ttys, I didn't change anything, because all 256 pty entries are
ALREADY in there:

# Pseudo Terminals
ttyp0   none                    network
ttyp1   none                    network
...
ttySu   none                    network
ttySv   none                    network

So those are all there.
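
For reference, a short Python sketch that enumerates the names those 256 entries should have. It assumes the usual tty[pqrsPQRS][0-9a-v] naming, which matches the first and last entries shown above; the function and constant names are made up for this example.

```python
import string

BANKS = "pqrsPQRS"                                   # one letter per MAKEDEV ptyN
UNITS = string.digits + string.ascii_lowercase[:22]  # 0-9 then a-v, 32 in all


def pty_names():
    """Return the 256 slave-side names in minor-number order."""
    return ["tty" + bank + unit for bank in BANKS for unit in UNITS]


names = pty_names()
# Count, first name, last of the first bank, first of the second bank, last name.
print(len(names), names[0], names[31], names[32], names[-1])
```

The first bank (ttyp0 through ttypv) is exactly 32 entries, the same number at which the machine stops handing out ptys.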

I have used `sysctl -a | grep maxuser` to verify that maxusers is indeed
128.

BUT - if I log on via ssh and start screen, and start 31 new screen
windows, then nobody else can log on to the system - I cannot create any
more screen windows AND nobody else can ssh in - the machine has run out
of ptys.

I use `fstat` to inquire, and I am maxed out at exactly 32 ptys.

SO THE question is, why am I stuck at 32 ptys ?  I have done it all -
everything that is in any doc or news post, and everything I was told to
do here and on -hackers, and yet I am still stuck at 32 !!!

Please tell me the secret lore for getting more than 32 ptys in
4.4-RELEASE.


thanks,

PT





Re: using vnconfig devices instead of partitions for jails ?

2002-02-28 Thread Patrick Thomas


thank you - I am glad to see that this is a good way of doing things.  Two
quick items:

1. How do I give each jail a 'proc' filesystem in its /proc using this
configuration ?

2. Is there any downside to this whatsoever ?  This seems infinitely
better than a new partition for each jail, so was I just silly for doing
it that way ?

thanks!

On Wed, 27 Feb 2002, Nik Clayton wrote:

 On Wed, Feb 27, 2002 at 03:03:11PM -0600, Kirk Strauser wrote:
  At 2002-02-27T20:49:18Z, Patrick Thomas [EMAIL PROTECTED] writes:
   I would like to put a large number of jails (16 or 20) on a server for
   testing purposes.
  
   I have two options so far: create 16 or 20 partitions OR just put them all
   in one partition, but the downside of that is that then I cannot enforce
   disk usage between jails.  So at this point, 16-20 partitions seems the
   safest route.
 
  Good question.  Is there any ability at all within the system to set a quota
  on a jail?

 Each vn* device has to be backed by a physical file on the system.
 Simply make sure that this physical device is the maximum size you want
 to allow in the jail.

 For example, on a server with 160GB of (RAID) disk, and 12 jails, each 10GB
 in size, I just have 12 jails;

 On the 'master' host for the jails.

 # cd /usr/local/jails/disk-images
 # ls -l
 total 1758115
 drwxr-xr-x  2 root  wheel  512 Jan 23 00:40 .
 drwxr-xr-x  4 root  wheel  512 Jan 23 00:39 ..
 -rw-r--r--  1 root  wheel  136 Jan 22 18:45 README
 -rw-r--r--  1 root  wheel  10737418240 Feb 27 23:35 foo.com.vn
 -rw-r--r--  1 root  wheel  10737418240 Feb 27 23:35 bar.com.vn
 -rw-r--r--  1 root  wheel  10737418240 Feb 27 23:35 baz.com.vn
 ...

 These were created with `truncate -s 10G file`, and are then
 configured on different vn* devices, which are then mounted as normal.
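
The same fixed-size backing file can be made from a script. A minimal Python sketch follows; the helper name and the example.vn file name are hypothetical, and whether the file ends up sparse depends on the underlying filesystem.

```python
import os
import tempfile


def make_backing_file(path, size_bytes):
    """Create (or resize) a file to exactly size_bytes without writing data."""
    with open(path, "ab") as f:
        f.truncate(size_bytes)
    return os.path.getsize(path)


# 10G as it appears in the ls output above.
TEN_GIB = 10 * 2**30

# Demo with a small size in a temporary directory (hypothetical file name).
with tempfile.TemporaryDirectory() as d:
    demo = os.path.join(d, "example.vn")
    print(make_backing_file(demo, 1024 * 1024))
```

The size passed here is what caps the jail's disk usage, since the filesystem on the vn device cannot grow past its backing file.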

 # mount
 ...
 /dev/vn0a on /usr/local/jails/foo.com
 /dev/vn1a on /usr/local/jails/bar.com
 /dev/vn2a on /usr/local/jails/baz.com
 ...

 N
 --
 FreeBSD: The Power to Serve  http://www.freebsd.org/   (__)
 FreeBSD Documentation Project    http://www.freebsd.org/docproj/\\\'',)
   \/  \ ^
--- 15B8 3FFC DDB4 34B0 AA5F  94B7 93A8 0764 2C37 E375 --- .\._/_)






Re: using vnconfig devices instead of partitions for jails ?

2002-02-28 Thread Patrick Thomas


one other thing:

How many mount points (jails, in this case) can I run ?  I see that there
are 8 existing vn0X device files in /dev - can I just create more of them
using MAKEDEV (or mknod) and keep going ?

What is the maximum ?  256 ?

also, do I need to alter the kernel to support more vn0X device files, or
does a stock kernel support all the way up to the maximum (whatever that
is - see previous question :)

thanks again - much appreciated.

--pt





using vnconfig devices instead of partitions for jails ?

2002-02-27 Thread Patrick Thomas


I would like to put a large number of jails (16 or 20) on a server for
testing purposes.

I have two options so far: create 16 or 20 partitions OR just put them all
in one partition, but the downside of that is that then I cannot enforce
disk usage between jails.  So at this point, 16-20 partitions seems the
safest route.

But, what about using vnconfig to create files of fixed sizes and then
mounting them?

Is this reasonable ?

Is there a limit to how many vnconfig files I can mount as filesystems ?

Is there a way to mount a directory _inside_ a vnconfig mount as a 'proc'
filesystem (since the jail needs proc in order for ps, etc., to work?)

Any comments about this idea in general are appreciated.

--PT

