Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Don Lewis
On 19 Jan, Stefan Esser wrote:
 
> I seem to remember, that drives of that time required the write cache
> to be enabled to get any speed-up from tagged commands. This was no
> risk with SCSI drives, since the cache did not make the drives lye
> about command completion (i.e. the status for the write was only
> returned when the cached data had been written to disk, independently
> of the write cache enable).

For a very long time, all of the SCSI drives that I have purchased have
come with the WCE bit turned on.  I always had to remember to use
camcontrol to turn it off.  When I last benchmarked it quite a few years
ago, buildworld times were about the same with either setting, and my
filesystems were a lot safer with WCE off, which UFS+SU depends on. I've
also seen drives dynamically drop the number of supported tags WCE was
on and the write cache started getting full, which made CAM unhappy.

I've been using SCSI for anything important for all these years except
on my laptop.  I haven't yet switched to SATA because I haven't put
together a new system since NCQ support made it into -STABLE.  The hard
drives in my -CURRENT machine are cast-offs from my primary machine.
Just doin' my part to make sure legacy support isn't broken ...


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Don Lewis
On 18 Jan, Wojciech Puchar wrote:

> If computer have UPS then write caching is fine. even if FreeBSD crash, 
> disk would write data

I've had my share of sudden UPS failures over the years.  Probably more
than half have been during an automatic battery self test.  UPS goes on
battery, and then *boom*, everything shuts down.  At that point the UPS
helpfully indicates that the battery needs to be replaced.  This seems
to happen more frequently once the batteries get to be about 4 years
old.  I've started replacing them after 3 years.

My next big build will have redundant PSUs, each connected to a separate
UPS.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Pull in upstream before 9.1 code freeze?

2012-07-05 Thread Don Lewis
On  5 Jul, Olivier Smedts wrote:

> an Ubuntu "server" :
> # time fsqfqsdfs
> fsqfqsdfs: command not found
> 
> real0m0.408s
> user0m0.120s
> sys 0m0.040s
> 
> and that's a *fast* one !

Lucky you!

Fedora 16 on my fastest hardware ...

# time fsqfqsdfs
bash: fsqfqsdfs: command not found...

real0m2.110s
user0m0.018s
sys 0m0.010s

... makes typos very annoying.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: FreeBSD has serious problems with focus, longevity, and lifecycle

2012-01-24 Thread Don Lewis
On 17 Jan, Atom Smasher wrote:
> thanks john.
> 
> i've been a long-time (10+ years) freeBSD user (desktops, laptops, 
> servers, and anywhere else i can run it) and advocate (encouraging others 
> to at least check it out) and also a long-time satisfied johnco customer.
> 
> my freeBSD days seem to be coming to an end.
> 
> i bought myself a LENOVO T510 when it first came out, around early 2010. 
> it's got an i5 CPU and Arrandale GPU. it's two years old and on freeBSD i 
> STILL can't run xorg properly with it. linux has run fine with it since i 
> opened the box. last i checked, freeBSD will be support this GPU in R9... 
> or maybe R10...?
> 
> i really like freeBSD's robustness, especially compared to linux, among 
> other things. i like that freeBSD is genetically a "real" unix... what's 
> the real difference between BSD and linux? BSD was developed by unix 
> hackers porting the OS to PC hardware; linux was developed by PC hackers 
> trying to make their own version of unix. these origins are still very 
> apparent, if one knows where to look.
> 
> i like that i can set up a freeBSD bare-bones (eg secure) mail-server or 
> web-server in an afternoon.
> 
> but none of that matters if the damn thing just doesn't work.
> 
> over the last two years, and it pains me to say this, i've been running 
> linux on that T510. but it gets worse... i've been finding that i'm simply 
> more productive on that machine, and spending more time in front of it, 
> and more time getting useful things done.
> 
> i understand that it's ultimately a matter of manpower and resources, and 
> linux seems to have more momentum and "sex appeal", but i'm finding myself 
> in a real crisis of faith... the OS that i've been using and loving for 
> 10+ years seems to be dying, from any real usability perspective.
> 
> and for now, i'm slowly and reluctantly migrating towards linux.

My experience has been pretty much the opposite.

I've been using FreeBSD since 2.1 both as part of my job and also for
personal use.  I've been using Linux for work for about the last ten
years.

I'm currently running FreeBSD 8.2-STABLE for my personal computing
needs.  My experience is that the base system has been very solid and
the only problems that I run into have been with the ports.  The last
major problem that I had with the base system was USB printer support in
7.x, which was incomplete and flakey and then deteriorated to the point
of being unusable.  This motivated me to migrate to 8 which has a
rewritten USB implementation.

At work, our major software vendor only supports their software on Red
Hat Enterprise Linux and SUSE Linux Enterprise.  Since our budget is
tight, we run CentOS, which is essentially a repackaged version of RHEL.
We've been running CentOS 4.x, but are switching to CentOS 5.x now that
4.x has been EOLed.

My experience with CentOS is that new features and support for recent
hardware lags quite a bit.  A few years ago I had a motherboard that I
liked a lot that ran Fedora just fine, but CentOS lacked support for. It
took a very long time before CentOS supported NFS over TCP.  This was
especially painful because we rely heavily on NFS across a WAN to
support access to the same data from diferent sites.  In addition our
WAN is implemented using IPSEC tunnels, which have a smaller MTU, and
Linux doesn't support manually setting the MTU on a per-route basis (and
I wasn't able to get PMUTD to work).  What I observed was that the NFS
packets would first be fragmented to the default 1500 byte MTU and then
the firewall would fragment each of those packets into one large packet
followed by a tiny one.  This, along with the lack of TCP's congestion
control, was not beneficial to NFS performance ...

Bugs are also slow to get fixed.  For a very long time I had problems
with gam_server running away.  I'd frequently start top and see
gam_server pegged at 100% CPU, stealing time from my simulations.  If I
killed it, it would get respawed, and would then behave itself for a few
days before running away again.  This bug eventually got fixed, but
there's a kwin bug in CentOS 4 that still hasn't been fixed.  Every now
an then, kwin will stop working and I can no longer move windows around
my desktop.  I'm pretty sure this is fixed in a newer version of KDE,
but RHEL/CentOS tend to stick with one major version of their "ports"
forever, so I probably couldn't expect to see this fixed until I upgrade
to the next version of CentOS.  Things might be different if I was a
paying RHEL customer and was able to motivate Red Hat into back porting
patches for particular bugs.

Another "feature" that I get to enjoy on a daily basis is that the
kernel in CentOS 4 does not like my KVM switch.  When switching to my
CentOS machine, I have about a 50% chance that any mouse movement will
cause the cursor to fly all over the screen and spew random mouse clicks
all over my desktop.  This typically causes a bunch of windows to pop
up, and it also usually causes

Re: Per-mount syncer threads and fanout for pagedaemon cleaning

2011-12-26 Thread Don Lewis
On 26 Dec, Venkatesh Srinivas wrote:
> Hi!
> 
> I've been playing with two things in DragonFly that might be of interest here.
> 
> Thing #1 :=
> 
> First, per-mountpoint syncer threads. Currently there is a single thread,
> 'syncer', which periodically calls fsync() on dirty vnodes from every mount,
> along with calling vfs_sync() on each filesystem itself (via syncer vnodes).
> 
> My patch modifies this to create syncer threads for mounts that request it.
> For these mounts, vnodes are synced from their mount-specific thread rather
> than the global syncer.
> 
> The idea is that periodic fsync/sync operations from one filesystem should not
> stall or delay synchronization for other ones. 
> 
> The patch was fairly simple:
> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/50e4012a4b55e1efc595db0db397b4365f08b640
> 
> There's certainly more that could be done in this direction -- the current 
> patch
> does preserve a global syncer ('syncer0') for unflagged filesystems and for
> running the rushjobs logic from speedup_syncer. And the current patch 
> preserves
> the notion of syncer vnodes, which are entirely overkill when there are 
> per-mount sync threads. But its a start and something very similar could apply
> to FreeBSD.

I used to think that something like this was a good idea, but the first
issue that I thought of was that this could cause excessive seeking if
multiple threads were attempting to sync vnodes for multiple partitions
on the same physical device at the same time.

What might be better is one thread per physical device, possibly doing a
simple elevator sort based on partition number and inode number on the
items in each worklist bucket.  It might still be possible to retain the
advantages of this with a one thread per mount point implementation by
adding interlocks (or even just start time offsets) so that they don't
all try to run at once and fight over the head actuator.  Implementing
one thread per mount point does have the advantage of making it easy to
observe which mount points are "busy".

The next complication is that all of the different ways that we have to
slice, dice, and combine storage devices (various forms of RAID, ZFS
pools, etc.) make the concept of a device a lot more complicated.  How
do we optimize?  Should we even try?

One of the things that you didn't mention about syncer vnodes is all of
the nastyness that goes on inside ffs_sync() and friends every time the
syncer gets to the syncer vnode.  That causes a big burst of I/O every
30 seconds.


> Thing #2 :=
> 
> Currently when pagedaemon decides to launder a dirty page, it initiates I/O
> for the launder from within its own thread context. While the I/O is generally
> asynchronous, the call path to get there from pagedaemon is deep and fraught
> with stall points: (for vnode_pager, possible stalls annotated)
> 
>   pagedaemon scans ->
>   ...
>   vm_pageout_clean -> 
> [block on vm_object locks,
>   
>  page busy]
>   vm_pageout_flush ->
>   vnode_pager_putpages ->
>   vnode_generic_putpages ->
>   _write ->   
> [block on FS locks]
>   b(,a,d)write -> 
> [wait on runningbufspace]
>   
> _stratgy ->
>   
> Oh my...
> 
> While any part of this path is stalled, pagedaemon is not continuing to do its
> job; this could be a problem -- so long as it is not laundering pages, we are
> not resolving any page shortages.
> 
> Given Thing #1, we have per-mountpoint service threads; I think it'd be worth
> pushing out the deeper parts of this callpath into those threads. The idea is
> that pagedaemon would select and cluster pages as it does now, but would use
> the syncer threads to walk through the pager and FS layer. An added benefit
> of using the syncer threads is that contention between fsync/vfs_sync on an
> FS and pageout on that same FS would be excluded. The pagedaemon would not 
> wait for the I/O to initiate before continuing to scan more candidates.
> 
> I've not found an ideal place to break up this callchain, but either between
> vm_pageout_clean / vm_pageout_flush, or at the entry to the vnode_pager would
> be good places. In experiments, I've sent the vm_pageout_flush calls off to
> a convenient taskqueue, seems to work okay. But sending them to per-mount
> threads would be better.

The current implementation definitely has the flaws that you mention.  I
remember system deadlocks in years past.  Your idea for a fix looks
interesting.

_

Re: 8.1-RELEASE hangs on reboot

2010-12-01 Thread Don Lewis
On  1 Dec, Ondřej Majerech wrote:
> Hello,
> 
> my 8.1-R system has just started hanging on reboot. Specifically after
> I svn up'd my source and updated from 8.1-R-p1 to -p2.
> 
> Some kind of hang occurs on every reboot attempt. Usually it hangs at
> the "Rebooting..." message, but sometimes the thing just locks up
> before it even syncs disks. shutdown -p now seems to shutdown the
> system successfully each time.

One of my systems running 8.1-STABLE started reliably(?) hanging at the
"Rebooting..." step whenever try to reboot it.  It's been doing this for
the last month or so.  I haven't seen the earlier hang.  9.0-CURRENT on
the same hardware doesn't experience this problem.

I haven't had time to try to debug this, so I've just been using the
reset switch when it hangs.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: fsck of large volume with small memory

2007-09-25 Thread Don Lewis
On 25 Sep, sam wrote:
> sam wrote:
>> Don Lewis wrote:
>>> On 24 Sep, sam wrote:
>>>   Expect major file system lossage ...
>>> I think this patch could be better, but this should get you going ...
>>>
>>>
>>
>>
>> UNEXPECTED SOFT UPDATE INCONSISTENCY
>> LOST 74 DIRECTORIES
>>
>> UNEXPECTED SOFT UPDATE INCONSISTENCY
>> fsck: /dev/aacd0s1f: Segmentation fault: 11

It would be good to know the cause of this segfault so that the code
could be fixed to prevent it.

>> #
>> ==
>>
>> /Vladimir Ermakov
>>
>>
> # cat /etc/rc.conf |grep fsck
> fsck_y_enable="YES"
> background_fsck="NO"
> 
> hm, and after system reboot
> 
> =
> # fsck /dev/aacd0s1f
> ** /dev/aacd0s1f (NO WRITE)
> ** Last Mounted on /usr
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> ** Phase 5 - Check Cyl groups
> 438959 files, 3567329 used, 69528964 free (94684 frags, 8679285 blocks, 
> 0.1% fragmentation)
> #
> =

That's the fastest way of getting the file system back into a consistent
state (or you could run "fsck -y" in single-user mode), but it increases
the probability of data loss. The problem is that my patch allows fsck
to examine a bunch of uninitialized inodes in the damaged cylinder
group, and there is the possibility that one or more of these inodes
could look reasonably valid and contain block pointers that point to
blocks in valid files.  Fsck will then detect the duplicate block
pointers and clear the inodes for both files. It would be nice if fsck
could be told to put less trust in the inodes that might not actually be
initialized, but this gets complicated ...

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: fsck of large volume with small memory

2007-09-25 Thread Don Lewis
On 25 Sep, sam wrote:
> Don Lewis wrote:
>> On 24 Sep, sam wrote:
>>   
>>
>>> any solutions ?
>>> 
>>
>> The patch below should allow a manual fsck to run to completion. I'd
>> recommend running "fsck -N" and capturing its output.  Then use the clri
>>   
> # fsck -N
> fsck: illegal option -- N
> usage: fsck [-dfnpvy] [-B | -F] [-T fstype:fsoptions] [-t fstype] 
> [special | node] ...

Sorry, it should be "fsck -n".

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: fsck of large volume with small memory

2007-09-24 Thread Don Lewis
On 24 Sep, sam wrote:
> hi, all
> http://lists.freebsd.org/pipermail/freebsd-questions/2007-June/151686.html
> 
> my problem
> # fsck /dev/aacd0s1f
> ** /dev/aacd0s1f (NO WRITE)
> ** Last Mounted on /usr
> ** Phase 1 - Check Blocks and Sizes
> fsck_ufs: cannot alloc 2378019004 bytes for inoinfo

I'd be willing to bet that one of the cylinder group blocks in your file
system got corrupted.
 
> any solutions ?

The patch below should allow a manual fsck to run to completion. I'd
recommend running "fsck -N" and capturing its output.  Then use the clri
command (either standalone or in fsdb) to zero out the uninitialized
inodes that are unmasked by setting cg_initediblk to its maximum
possible value based on the file system parameters.

Expect major file system lossage ...

I think this patch could be better, but this should get you going ...

Index: sbin/fsck_ffs/pass1.c
===
RCS file: /home/ncvs/src/sbin/fsck_ffs/pass1.c,v
retrieving revision 1.43
diff -u -r1.43 pass1.c
--- sbin/fsck_ffs/pass1.c   8 Oct 2004 20:44:47 -   1.43
+++ sbin/fsck_ffs/pass1.c   24 Sep 2007 23:15:22 -
@@ -93,9 +93,29 @@
inumber = c * sblock.fs_ipg;
setinodebuf(inumber);
getblk(&cgblk, cgtod(&sblock, c), sblock.fs_cgsize);
-   if (sblock.fs_magic == FS_UFS2_MAGIC)
+   if (sblock.fs_magic == FS_UFS2_MAGIC) {
inosused = cgrp.cg_initediblk;
-   else
+if (inosused < 0 || inosused > sblock.fs_ipg) {
+   pfatal("CG %d: PREPOSTEROUS NUMBER OF INODES %d 
(cg_initediblk), ASSUMING %d (fs_ipg)\n",
+   c, inosused, sblock.fs_ipg);
+   /*
+* The cylinder group block is most likely
+* totally corrupted and will probably
+* fail the magic number check below as well.
+* Ignoring cg_initediblk and setting
+* inosused to sblock.fs_ipg will allow
+* a manual fsck to proceed further instead
+* of dying when it attempts to allocate
+* an insane amount of memory to store
+* the inode info for this cylinder group.
+* This may provide enough information
+* to allow the system administrator to
+* do a better job of patching the
+* filesystem with fsdb.
+*/
+   inosused = sblock.fs_ipg;
+   }
+   } else
inosused = sblock.fs_ipg;
if (got_siginfo) {
printf("%s: phase 1: cyl group %d of %d (%d%%)\n",

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Problems with rpc.statd and PAE

2007-08-02 Thread Don Lewis
On 31 Jul, João Carlos Mendes Luís wrote:
> Hi,
> 
> Sent this to -questions, but got no answer.  Now I'll try -hackers...
> 
> I've just configured my first server with 4G RAM.  To use it, I had
> to select PAE in kernel config.  I was a little bit troubled by it's
> advice not to use modules (is it that critical?), but got it running.
> 
> But when it is running on PAE, NFS statd refuses to run:
> 
> # /etc/rc.d/nfslocking start
> Starting statd.
> rpc.statd: unable to mmap() status file: Cannot allocate memory
> Segmentation fault
> #
> 
> Using strace I found it was trying to mmap the status file, at
> /var/db/statd.status:
> 
> open("/var/db/statd.status", O_RDWR)= 10
> mmap(0, 268435456, PROT_READ|PROT_WRITE, MAP_SHARED, 10, 0) = -1 ENOMEM
> (Cannot allocate memory)
> 
> It's really strange to have mmap len = 256M, specially because the
> file is always small.  But it works without PAE, and do not work with
> PAE.  And it is described in the handbook:
> 
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/book.html#STATD-MEM-LEAK

I've been seeing this same problem for a long time on an 7.0-CURRENT
i386 machine with 1GB of RAM, and I'm not using PAE.  I haven't
discovered any obvious cause for the problem.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: freebsd-5.4-stable panics

2005-10-12 Thread Don Lewis
On 12 Oct, Rob Watt wrote:
>> >> On Fri, 7 Oct 2005, Don Lewis wrote:
>> I MFC'ed the fix to RELENG_6 last week, but the patch didn't apply
>> cleanly to RELENG_5.  I tweaked the patch for RELENG_5 and tested it on
>> a UP box.  I'd like to get some testing on SMP hardware before I commit
>> it to RELENG_5, just to make sure that I don't destabilize -STABLE.  I
>> do want to get the fix into RELENG_5, since this thread originated with
>> a complaint about 5.4-STABLE.
> 
> I should be able to have a 5.4 machine available to test this tonight. Can
> you send me the tweaked patch?

I found a couple little nits that I fixed in this version:

Index: sys/kern/kern_proc.c
===
RCS file: /home/ncvs/src/sys/kern/kern_proc.c,v
retrieving revision 1.215.2.6
diff -u -r1.215.2.6 kern_proc.c
--- sys/kern/kern_proc.c22 Mar 2005 13:40:23 -  1.215.2.6
+++ sys/kern/kern_proc.c12 Oct 2005 19:13:14 -
@@ -72,6 +72,8 @@
 
 static void doenterpgrp(struct proc *, struct pgrp *);
 static void orphanpg(struct pgrp *pg);
+static void fill_kinfo_proc_only(struct proc *p, struct kinfo_proc *kp);
+static void fill_kinfo_thread(struct thread *td, struct kinfo_proc *kp);
 static void pgadjustjobc(struct pgrp *pgrp, int entering);
 static void pgdelete(struct pgrp *);
 static int proc_ctor(void *mem, int size, void *arg, int flags);
@@ -601,33 +603,22 @@
}
 }
 #endif /* DDB */
-void
-fill_kinfo_thread(struct thread *td, struct kinfo_proc *kp);
 
 /*
- * Fill in a kinfo_proc structure for the specified process.
+ * Clear kinfo_proc and fill in any information that is common
+ * to all threads in the process.
  * Must be called with the target process locked.
  */
-void
-fill_kinfo_proc(struct proc *p, struct kinfo_proc *kp)
-{
-   fill_kinfo_thread(FIRST_THREAD_IN_PROC(p), kp);
-}
-
-void
-fill_kinfo_thread(struct thread *td, struct kinfo_proc *kp)
+static void
+fill_kinfo_proc_only(struct proc *p, struct kinfo_proc *kp)
 {
-   struct proc *p;
struct thread *td0;
-   struct ksegrp *kg;
struct tty *tp;
struct session *sp;
struct timeval tv;
struct ucred *cred;
struct sigacts *ps;
 
-   p = td->td_proc;
-
bzero(kp, sizeof(*kp));
 
kp->ki_structsize = sizeof(*kp);
@@ -685,7 +676,8 @@
kp->ki_tsize = vm->vm_tsize;
kp->ki_dsize = vm->vm_dsize;
kp->ki_ssize = vm->vm_ssize;
-   }
+   } else if (p->p_state == PRS_ZOMBIE)
+   kp->ki_stat = SZOMB;
if ((p->p_sflag & PS_INMEM) && p->p_stats) {
kp->ki_start = p->p_stats->p_start;
timevaladd(&kp->ki_start, &boottime);
@@ -704,71 +696,6 @@
kp->ki_nice = p->p_nice;
bintime2timeval(&p->p_runtime, &tv);
kp->ki_runtime = tv.tv_sec * (u_int64_t)100 + tv.tv_usec;
-   if (p->p_state != PRS_ZOMBIE) {
-#if 0
-   if (td == NULL) {
-   /* XXXKSE: This should never happen. */
-   printf("fill_kinfo_proc(): pid %d has no threads!\n",
-   p->p_pid);
-   mtx_unlock_spin(&sched_lock);
-   return;
-   }
-#endif
-   if (td->td_wmesg != NULL) {
-   strlcpy(kp->ki_wmesg, td->td_wmesg,
-   sizeof(kp->ki_wmesg));
-   }
-   if (TD_ON_LOCK(td)) {
-   kp->ki_kiflag |= KI_LOCKBLOCK;
-   strlcpy(kp->ki_lockname, td->td_lockname,
-   sizeof(kp->ki_lockname));
-   }
-
-   if (p->p_state == PRS_NORMAL) { /*  XXXKSE very approximate */
-   if (TD_ON_RUNQ(td) ||
-   TD_CAN_RUN(td) ||
-   TD_IS_RUNNING(td)) {
-   kp->ki_stat = SRUN;
-   } else if (P_SHOULDSTOP(p)) {
-   kp->ki_stat = SSTOP;
-   } else if (TD_IS_SLEEPING(td)) {
-   kp->ki_stat = SSLEEP;
-   } else if (TD_ON_LOCK(td)) {
-   kp->ki_stat = SLOCK;
-   } else {
-   kp->ki_stat = SWAIT;
-   }
-   } else {
-   kp->ki_stat = SIDL;
-   }
-
-   kg = td->td_ksegrp;
-
-   /* things in the KSE GROUP */
-   kp->ki_estcpu = kg->kg_estcpu;
-   kp->ki_slptime = kg->kg_slptime;
-   kp->ki_pri.pri_user = kg->kg_user_pri;
-  

Re: freebsd-5.4-stable panics

2005-10-11 Thread Don Lewis
On 11 Oct, Rob Watt wrote:
> On Mon, 10 Oct 2005, Rob Watt wrote:
> 
>> Don,
>>
>> On Fri, 7 Oct 2005, Don Lewis wrote:
>>
>> > Both HEAD and RELENG_6 have been patched.  I've tested the following
>> > patch for RELENG_5 on a uniprocessor sparc64 box.  I'd appreciate it if
>> > anyone who was running into this problem on RELENG_5 with SMP hardare
>> > could test it before I do the MFC.
>>
>> We have a machine running with those patches applied. We need to do some
>> other tests on it today, but tonight we will run our threaded applications
>> that trigger the kern_proc problem in top. We should have results tomorrow
>> morning.
> 
> Don,
> 
> I had misunderstood what you had asked. I tested this on a 6.0 machine. I
> could not crash an amd64 SMP box running 6.0-BETA5 with this patch. I do
> not have a test box running RELENG_5 to try this patch on right now. If I
> can setup a test box I will let you know our results, but that may take a
> day or two.

I MFC'ed the fix to RELENG_6 last week, but the patch didn't apply
cleanly to RELENG_5.  I tweaked the patch for RELENG_5 and tested it on
a UP box.  I'd like to get some testing on SMP hardware before I commit
it to RELENG_5, just to make sure that I don't destabilize -STABLE.  I
do want to get the fix into RELENG_5, since this thread originated with
a complaint about 5.4-STABLE.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: freebsd-5.4-stable panics

2005-10-07 Thread Don Lewis
On  3 Oct, Rob Watt wrote:

> We noticed the patches from Don Lewis, but have not tested them yet. We
> weren't sure if we could just apply those patches against 6.0-BETA5, or
> whether we should wait for them to be MFC'd.

Both HEAD and RELENG_6 have been patched.  I've tested the following
patch for RELENG_5 on a uniprocessor sparc64 box.  I'd appreciate it if
anyone who was running into this problem on RELENG_5 with SMP hardare
could test it before I do the MFC.


Index: sys/kern/kern_proc.c
===
RCS file: /home/ncvs/src/sys/kern/kern_proc.c,v
retrieving revision 1.215.2.6
diff -u -r1.215.2.6 kern_proc.c
--- sys/kern/kern_proc.c22 Mar 2005 13:40:23 -  1.215.2.6
+++ sys/kern/kern_proc.c7 Oct 2005 23:17:26 -
@@ -72,6 +72,8 @@
 
 static void doenterpgrp(struct proc *, struct pgrp *);
 static void orphanpg(struct pgrp *pg);
+static void fill_kinfo_proc_only(struct proc *p, struct kinfo_proc *kp);
+static void fill_kinfo_thread(struct thread *td, struct kinfo_proc *kp);
 static void pgadjustjobc(struct pgrp *pgrp, int entering);
 static void pgdelete(struct pgrp *);
 static int proc_ctor(void *mem, int size, void *arg, int flags);
@@ -601,33 +603,22 @@
}
 }
 #endif /* DDB */
-void
-fill_kinfo_thread(struct thread *td, struct kinfo_proc *kp);
 
 /*
- * Fill in a kinfo_proc structure for the specified process.
+ * Clear kinfo_proc and fill in any information that is common
+ * to all threads in the process.
  * Must be called with the target process locked.
  */
-void
-fill_kinfo_proc(struct proc *p, struct kinfo_proc *kp)
-{
-   fill_kinfo_thread(FIRST_THREAD_IN_PROC(p), kp);
-}
-
-void
-fill_kinfo_thread(struct thread *td, struct kinfo_proc *kp)
+static void
+fill_kinfo_proc_only(struct proc *p, struct kinfo_proc *kp)
 {
-   struct proc *p;
struct thread *td0;
-   struct ksegrp *kg;
struct tty *tp;
struct session *sp;
struct timeval tv;
struct ucred *cred;
struct sigacts *ps;
 
-   p = td->td_proc;
-
bzero(kp, sizeof(*kp));
 
kp->ki_structsize = sizeof(*kp);
@@ -685,7 +676,8 @@
kp->ki_tsize = vm->vm_tsize;
kp->ki_dsize = vm->vm_dsize;
kp->ki_ssize = vm->vm_ssize;
-   }
+   } else if (p->p_state != PRS_ZOMBIE)
+   kp->ki_stat = SZOMB;
if ((p->p_sflag & PS_INMEM) && p->p_stats) {
kp->ki_start = p->p_stats->p_start;
timevaladd(&kp->ki_start, &boottime);
@@ -704,71 +696,6 @@
kp->ki_nice = p->p_nice;
bintime2timeval(&p->p_runtime, &tv);
kp->ki_runtime = tv.tv_sec * (u_int64_t)100 + tv.tv_usec;
-   if (p->p_state != PRS_ZOMBIE) {
-#if 0
-   if (td == NULL) {
-   /* XXXKSE: This should never happen. */
-   printf("fill_kinfo_proc(): pid %d has no threads!\n",
-   p->p_pid);
-   mtx_unlock_spin(&sched_lock);
-   return;
-   }
-#endif
-   if (td->td_wmesg != NULL) {
-   strlcpy(kp->ki_wmesg, td->td_wmesg,
-   sizeof(kp->ki_wmesg));
-   }
-   if (TD_ON_LOCK(td)) {
-   kp->ki_kiflag |= KI_LOCKBLOCK;
-   strlcpy(kp->ki_lockname, td->td_lockname,
-   sizeof(kp->ki_lockname));
-   }
-
-   if (p->p_state == PRS_NORMAL) { /*  XXXKSE very approximate */
-   if (TD_ON_RUNQ(td) ||
-   TD_CAN_RUN(td) ||
-   TD_IS_RUNNING(td)) {
-   kp->ki_stat = SRUN;
-   } else if (P_SHOULDSTOP(p)) {
-   kp->ki_stat = SSTOP;
-   } else if (TD_IS_SLEEPING(td)) {
-   kp->ki_stat = SSLEEP;
-   } else if (TD_ON_LOCK(td)) {
-   kp->ki_stat = SLOCK;
-   } else {
-   kp->ki_stat = SWAIT;
-   }
-   } else {
-   kp->ki_stat = SIDL;
-   }
-
-   kg = td->td_ksegrp;
-
-   /* things in the KSE GROUP */
-   kp->ki_estcpu = kg->kg_estcpu;
-   kp->ki_slptime = kg->kg_slptime;
-   kp->ki_pri.pri_user = kg->kg_user_pri;
-   kp->ki_pri.pri_class = kg->kg_pri_class;
-
-   /* Things in the thread */
-   kp->ki_wchan = td->td_wchan;
-   kp->ki_pri.pri_level = td->td_priority;
-   

Re: freebsd-5.4-stable panics

2005-10-04 Thread Don Lewis
On  3 Oct, Rob Watt wrote:
>> It turns out that the sysctl buffer is already wired in one of the two
>> cases
>> that this function is called, so I moved the wiring up to the upper
> layer
>> in
>> the other case and cut out a bunch of the locking gymnastics as a
> result.
>> Can you try this patch?
>>
>> Index: kern_proc.c
>> ===
>> RCS file: /usr/cvs/src/sys/kern/kern_proc.c,v
>> retrieving revision 1.231
>> diff -u -r1.231 kern_proc.c
>> --- kern_proc.c 27 Sep 2005 18:03:15 - 1.231
>> +++ kern_proc.c 30 Sep 2005 17:04:57 -
>> @@ -875,22 +875,16 @@
>>
>> if (flags & KERN_PROC_NOTHREADS) {
>> fill_kinfo_proc(p, &kinfo_proc);
>> - PROC_UNLOCK(p);
>> error = SYSCTL_OUT(req, (caddr_t)&kinfo_proc,
>> sizeof(kinfo_proc));
>> - PROC_LOCK(p);
>> } else {
>> - _PHOLD(p);
>> FOREACH_THREAD_IN_PROC(p, td) {
>> fill_kinfo_thread(td, &kinfo_proc);
>> - PROC_UNLOCK(p);
>> error = SYSCTL_OUT(req, (caddr_t)&kinfo_proc,
>> sizeof(kinfo_proc));
>> - PROC_LOCK(p);
>> if (error)
>> break;
>> }
>> - _PRELE(p);
>> }
>> PROC_UNLOCK(p);
>> if (error)
>> @@ -932,6 +926,9 @@
>> if (oid_number == KERN_PROC_PID) {
>> if (namelen != 1)
>> return (EINVAL);
>> + error = sysctl_wire_old_buffer(req, 0);
>> + if (error)
>> + return (error);
>> p = pfind((pid_t)name[0]);
>> if (!p)
>> return (ESRCH);
> 
> John,
> 
> We tried this patch and were able to run our simulations (and top) for 3
> days straight without crashing. Since we were panicking every 3-6 hours
> before when running top, this seems to have fixed the problem.
> 
> We noticed the patches from Don Lewis, but have not tested them yet. We
> weren't sure if we could just apply those patches against 6.0-BETA5, or
> whether we should wait for them to be MFC'd.

I haven't tried applying my patch to RELENG_5 yet, but hope to do so in
the next few days in preparation for doing a MFC.  If any changes are
required, I can send you a copy of the patch.


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: freebsd-5.4-stable panics

2005-10-02 Thread Don Lewis
On  2 Oct, Don Lewis wrote:

> It turns out that fill_kinfo_thread() grabs a bunch of locks to grab
> things out of struct proc, which breaks badly if sched_lock is grabbed
> before calling fill_kinfo_thread().
> 
> I refactored fill_kinfo_thread() into two functions, one of which
> doesn't need any additional locks and only gathers per-thread data, and
> a new function, fill_kinfo_proc_only(), which gathers the data that is
> common to all theads and can be called before grabbing sched_lock.  This
> should be more efficient if there is more than one thread because the
> per-process data is only gathered once, and only the per-thread data in
> kinfo_proc is overwritten for each thread.

[ snip ]

After fixing a few whitespace nits and one minor buglet, I commited my
patch to HEAD, in kern_proc.c 1.232.  I hope to be able to MFC it soon.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: freebsd-5.4-stable panics

2005-10-02 Thread Don Lewis
On  1 Oct, Don Lewis wrote:
> On 30 Sep, John Baldwin wrote:

>> It turns out that the sysctl buffer is already wired in one of the two cases 
>> that this function is called, so I moved the wiring up to the upper layer in 
>> the other case and cut out a bunch of the locking gymnastics as a result.  

>> 
>> Index: kern_proc.c
>> ===
>> RCS file: /usr/cvs/src/sys/kern/kern_proc.c,v
>> retrieving revision 1.231
>> diff -u -r1.231 kern_proc.c
>> --- kern_proc.c  27 Sep 2005 18:03:15 -  1.231
>> +++ kern_proc.c  30 Sep 2005 17:04:57 -
>> @@ -875,22 +875,16 @@
>>  
>>  if (flags & KERN_PROC_NOTHREADS) {
>>  fill_kinfo_proc(p, &kinfo_proc);
>> -PROC_UNLOCK(p);
>>  error = SYSCTL_OUT(req, (caddr_t)&kinfo_proc,
>> sizeof(kinfo_proc));
>> -PROC_LOCK(p);
>>  } else {
>> -_PHOLD(p);
>>  FOREACH_THREAD_IN_PROC(p, td) {
>>  fill_kinfo_thread(td, &kinfo_proc);
>> -PROC_UNLOCK(p);
>>  error = SYSCTL_OUT(req, (caddr_t)&kinfo_proc,
>> sizeof(kinfo_proc));
>> -PROC_LOCK(p);
>>  if (error)
>>  break;
>>  }
>> -_PRELE(p);
>>  }
>>  PROC_UNLOCK(p);
>>  if (error)
>> @@ -932,6 +926,9 @@
>>  if (oid_number == KERN_PROC_PID) {
>>  if (namelen != 1) 
>>  return (EINVAL);
>> +error = sysctl_wire_old_buffer(req, 0);
>> +if (error)
>> +return (error); 
>>  p = pfind((pid_t)name[0]);
>>  if (!p)
>>  return (ESRCH);
>> 
> 
> sched_lock needs to be grabbed before the FOREACH_THREAD_IN_PROC loop.
> 
> Can _PHOLD()/_PRELE() be dropped?

It turns out that fill_kinfo_thread() grabs a bunch of locks to grab
things out of struct proc, which breaks badly if sched_lock is grabbed
before calling fill_kinfo_thread().

I refactored fill_kinfo_thread() into two functions, one of which
doesn't need any additional locks and only gathers per-thread data, and
a new function, fill_kinfo_proc_only(), which gathers the data that is
common to all theads and can be called before grabbing sched_lock.  This
should be more efficient if there is more than one thread because the
per-process data is only gathered once, and only the per-thread data in
kinfo_proc is overwritten for each thread.

Index: kern_proc.c
===
RCS file: /home/ncvs/src/sys/kern/kern_proc.c,v
retrieving revision 1.231
diff -u -r1.231 kern_proc.c
--- kern_proc.c 27 Sep 2005 18:03:15 -  1.231
+++ kern_proc.c 2 Oct 2005 08:48:56 -
@@ -73,6 +73,8 @@
 
 static void doenterpgrp(struct proc *, struct pgrp *);
 static void orphanpg(struct pgrp *pg);
+static void fill_kinfo_proc_only(struct proc *p, struct kinfo_proc *kp);
+static void fill_kinfo_thread(struct thread *td, struct kinfo_proc *kp);
 static void pgadjustjobc(struct pgrp *pgrp, int entering);
 static void pgdelete(struct pgrp *);
 static int proc_ctor(void *mem, int size, void *arg, int flags);
@@ -596,33 +598,22 @@
}
 }
 #endif /* DDB */
-void
-fill_kinfo_thread(struct thread *td, struct kinfo_proc *kp);
 
 /*
- * Fill in a kinfo_proc structure for the specified process.
+ * Clear kinfo_proc and fill in any information that is common
+ * to all threads in the process.
  * Must be called with the target process locked.
  */
-void
-fill_kinfo_proc(struct proc *p, struct kinfo_proc *kp)
-{
-   fill_kinfo_thread(FIRST_THREAD_IN_PROC(p), kp);
-}
-
-void
-fill_kinfo_thread(struct thread *td, struct kinfo_proc *kp)
+static void
+fill_kinfo_proc_only(struct proc *p, struct kinfo_proc *kp)
 {
-   struct proc *p;
struct thread *td0;
-   struct ksegrp *kg;
struct tty *tp;
struct session *sp;
struct timeval tv;
struct ucred *cred;
struct sigacts *ps;
 
-   p = td->td_proc;
-
bzero(kp, sizeof(*kp));
 
kp->ki_structsize = sizeof(*kp);
@@ -684,78 +675,14 @@
kp->ki_tsize = vm->vm_tsize;
kp->ki_dsize = vm->vm_dsize;
kp->ki_ssize = vm->vm_ssize;
-   }
+   } else if (p->p_state == PRS_ZOMBIE)
+   kp->ki_stat = SZOMB;
kp->ki_sflag = p->p_sflag;
kp->ki_swtime = p->p_swtime;
kp->ki_pid = p->p_pid;
kp->ki_

Re: freebsd-5.4-stable panics

2005-10-01 Thread Don Lewis
On 30 Sep, John Baldwin wrote:
> On Friday 30 September 2005 11:25 am, Antoine Pelisse wrote:
>> On 9/30/05, John Baldwin <[EMAIL PROTECTED]> wrote:
>> > On Friday 30 September 2005 05:24 am, Antoine Pelisse wrote:
>> > > Hi Robert,
>> > > I don't think your patch is correct, the total linked list can be
>> > > broken
>> > >
>> > > while the lock is released, thus just passing the link may not be
>> > > enough I have submitted a PR[1] for this a month ago but nobody took
>> > > care of it yet Regards,
>> > > Antoine Pelisse
>> > >
>> > > [1] http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/84684
>> >
>> > I think this patch looks ok. Robert, can you get the original panic on
>> > this
>> > thread tested against this patch?
>>
>>  I had a small program which could reproduce this panic in 10 seconds, it
>> was basically creating empty threads and calling kvm_getprocs() in the same
>> time. Anyway the patch was able to stop the program from panicing.
>> The panic is also reproducible in RELENG_6 and HEAD IIRC.
> 
> It turns out that the sysctl buffer is already wired in one of the two cases 
> that this function is called, so I moved the wiring up to the upper layer in 
> the other case and cut out a bunch of the locking gymnastics as a result.  
> Can you try this patch?
> 
> Index: kern_proc.c
> ===
> RCS file: /usr/cvs/src/sys/kern/kern_proc.c,v
> retrieving revision 1.231
> diff -u -r1.231 kern_proc.c
> --- kern_proc.c   27 Sep 2005 18:03:15 -  1.231
> +++ kern_proc.c   30 Sep 2005 17:04:57 -
> @@ -875,22 +875,16 @@
>  
>   if (flags & KERN_PROC_NOTHREADS) {
>   fill_kinfo_proc(p, &kinfo_proc);
> - PROC_UNLOCK(p);
>   error = SYSCTL_OUT(req, (caddr_t)&kinfo_proc,
>  sizeof(kinfo_proc));
> - PROC_LOCK(p);
>   } else {
> - _PHOLD(p);
>   FOREACH_THREAD_IN_PROC(p, td) {
>   fill_kinfo_thread(td, &kinfo_proc);
> - PROC_UNLOCK(p);
>   error = SYSCTL_OUT(req, (caddr_t)&kinfo_proc,
>  sizeof(kinfo_proc));
> - PROC_LOCK(p);
>   if (error)
>   break;
>   }
> - _PRELE(p);
>   }
>   PROC_UNLOCK(p);
>   if (error)
> @@ -932,6 +926,9 @@
>   if (oid_number == KERN_PROC_PID) {
>   if (namelen != 1) 
>   return (EINVAL);
> + error = sysctl_wire_old_buffer(req, 0);
> + if (error)
> + return (error); 
>   p = pfind((pid_t)name[0]);
>   if (!p)
>   return (ESRCH);
> 

sched_lock needs to be grabbed before the FOREACH_THREAD_IN_PROC loop.

Can _PHOLD()/_PRELE() be dropped?

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: freebsd-5.4-stable panics

2005-10-01 Thread Don Lewis
On 30 Sep, Antoine Pelisse wrote:
>  Hi Robert,
> I don't think your patch is correct, the total linked list can be broken
> while the lock is released, thus just passing the link may not be enough
> I have submitted a PR[1] for this a month ago but nobody took care of it yet

There are two problems with your patch:

sched_lock needs to be held while iterating over the threads

sysctl_kern_proc() calls sysctl_out_proc() multiple times in a
loop in the !KERN_PROC_PID case, so the buffer needs to be wired
before calling sysctl_out_proc().

Is _PHOLD()/_PRELE() needed if we don't drop PROC_LOCK?

Passing a size estimate to sysctl_wire_old_buffer() is desirable, but
sysctl_out_proc() would need some restructuring to do this correctly.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: journaling fs and large mailbox format

2005-09-29 Thread Don Lewis
On 29 Sep, Doug Barton wrote:
> Mike Meyer wrote:

>> A 4K block won't hold your median file. But an 8K block wastes a lot of 
>> space. You might get a file with 0 blocks and 3 frags, assuming that UFS2
>>  will do that, which doesn't seem good. If UFS2 won't do that, you get a 
>> lot of half-empty blocks, which likewise isn't good. The other option is 
>> a 4K block size, which means you get a lot of 1 block + 1 frag files. 
>> That seems optimal in this case.
> 
> That's a logical analysis, but you're missing one important premise. UFS
> doesn't do "more than one file per frag" until the file system gets close to
> filling up, and the optimization switches from time to space. Therefore, in
> your example you're actually wasting more space than you would with 8k
> blocks, and as a side effect making the fs less efficient in at least 2 ways.

If you know that most of the files are write-once and don't grow over
time, you can tune the file system to always do space optimization.  I
used to do this with classic Usenet spools and it worked well.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Parking disk drive heads

2005-08-20 Thread Don Lewis
On 20 Aug, Eric Anderson wrote:

> As a data point, I've been using 64mb compact flash cards (rated at 100k 
> writes) in about 100 Soekris boxes (running FreeBSD) for about 4 years, 
> and they are all still working, except for one.  Now, most compact flash 
> cards are rated at 1 million writes.
> 
> And yes, I'm logging to the card and everything..

I've been using a laptop drive in my firewall and mail relay box for
noise and power consumption reasons.  The drive specs only give an
expected lifetime of a few years when running 24x7, and I just had to
replace a drive that had been in service about four years.  I've given
some thought to using flash, but I'm concerned about the number of
writes, especially since a mail relay (maybe 1K messages/day) is going
to be somewhat write intensive.  What would be nice is a flash-backed md
device that would flush its contents to flash on power fail.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: ffs_alloc.c: minfree Q

2005-08-10 Thread Don Lewis
On 10 Aug, Dmitry Morozovsky wrote:
> Colleagues,
> 
> 
> from ffs_alloc.c:
> 
> case FS_OPTSPACE:
> /*
>  * Allocate an exact sized fragment. Although this makes
>  * best use of space, we will waste time relocating it if
>  * the file continues to grow. If the fragmentation is
>  * less than half of the minimum free reserve, we choose
>  * to begin optimizing for time.
>  */
> request = nsize;
> if (fs->fs_minfree <= 5 ||
>--->>>~~
> fs->fs_cstotal.cs_nffree >
> (off_t)fs->fs_dsize * fs->fs_minfree / (2 * 100))
> break;
> log(LOG_NOTICE, "%s: optimization changed from SPACE to 
> TIME\n",
> fs->fs_fsmnt);
> fs->fs_optim = FS_OPTTIME;
> break;
> 
> For contemporary situation, where total size of file system can grow to 
> hundreds of Gs or even several Ts, 8% of space seems too high. 
> 
> Maybe this algorithm should be slightly adjusted (I'm thinking of logarithmic 
> scale depending on file system size)? 

I experimented with this back when I ran a Usenet server with a classic,
one article per file, spool.  If found that if I pushed the limit, I'd
often lose the ability to create files that were greater than or equal
to the file system block size because all of the free space consisted of
partial blocks that had one or more fragments allocated.  This would
happen even though df said the file system still had plenty of free
space.

The severity of this problem depends on the file size distribution and
its relationship to the file system block and fragment sizes, and
doesn't depend on the file system size.  If you double the size of the
file system, you can double the number of files stored before you run
into the problem, and you run into the problem at the about same
percentage of fullness no matter what the size of the file system.

You can avoid this problem if you set the fragment size the same as the
block size when you create the file system, but then the wasted space is
just hidden in the partially filled blocks at the end of each file,
where it is invisible to df.  This is similar to the behaviour of other
types of file systems that only have one allocation unit size.

Another problem that you are likely to run into if you run file systems
very nearly full is that eventually sequential I/O performance on larger
files tends to get very bad over time because the blocks contained in
each file get spread all over the disk, requiring a large number of
seeks to access them all. The number of contigous free blocks and the
distance between free blocks is going to depend on the percentage of
fullness and not the size of the file system.  If you have two file
systems of size N that are X% full, the distribution of the free space
in each file system and the I/O performance will be very similar to one
file system of size 2N that is also X% full.

A special case where cranking down minfree ok is when you have a
static set of a small number of large files that are created shortly
after the file system is newfs'ed so that the blocks allocated to each
file are largely contiguous.  Re-writing these files is even ok as long
as they are re-written in place and not truncated and re-extended.


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: problem with file system

2005-06-01 Thread Don Lewis
On  1 Jun, Eric Anderson wrote:
> GiZmen wrote:
>> Hi,
>> 
>> Recently my box had a power failure and after reboot when
>> i wanted to check my encrypted filesystem with fsck i 
>> have that message:
>> 
>> # fsck /dev/ad0s1g.bde
>> ** /dev/ad0s1g.bde (NO WRITE)
>> ** Last Mounted on /crypto
>> ** Phase 1 - Check Blocks and Sizes
>>  fsck_ufs: cannot alloc 2129430592 bytes for inoinfo
>> 
>> or:
>> 
>> # fsck_ffs -p /dev/ad0s1g.bde
>> /dev/ad0s1g.bde: NO WRITE ACCESS
>> /dev/ad0s1g.bde: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
>> 
>> 
>> i know that i have no write access but this file system is 
>> mounted right now and it seems that everything is ok.
>> I have tried to repair this when this file system was unmounted
>> but i had the same errors.
>> 
>> I don't know how to repair this file system. Could anyone point me
>> what to do?
> 
> I'm struggling with the same problem.  Suggestion so far from Don Lewis:
> 
> Try setting kern.maxdsiz to a larger value in /boot/loader.conf and
> rebooting.  I've got mine set to 1GB.
>   kern.maxdsiz="1073741824"
> 
> This didn't do it for me, but it might work for you.

I suspect this is a different problem.  In your case fsck_ufs was trying
to allocate a sane amount of memory, so my best guess was that your file
system was sufficiently large that you were running into the kernel
enforced datasize limit.  Run "limit" in your shell to double check that
the datasize limit increased.

In this case

>>  fsck_ufs: cannot alloc 2129430592 bytes for inoinfo

tells me that the power failure likely corrupted one of the cylinder
group blocks.  Here's my suggestion on how to fix this:

At line 92 in src/sbin/fsck_ffs/pass1.c, you should see the following
block of code:

for (c = 0; c < sblock.fs_ncg; c++) {
inumber = c * sblock.fs_ipg;
setinodebuf(inumber);
getblk(&cgblk, cgtod(&sblock, c), sblock.fs_cgsize);
if (sblock.fs_magic == FS_UFS2_MAGIC)
inosused = cgrp.cg_initediblk;
else
inosused = sblock.fs_ipg;

Try changing
inosused = cgrp.cg_initediblk;
to
inosused = (cgrp.cg_initediblk <= sblock.fs_ipg &&
cgrp.cg_initediblk > 0) ? cgrp.cg_initediblk :
sblock.fs_ipg;

Be prepared for the possibilty of a lot of file system damage.  You
might see a lot of files that claim the same blocks, and a lot of stuff
could end up in lost+found.  I recommend buying an UPS and installing
one of the UPS utilities from ports that does a clean shutdown before
the battery runs down.

At some point, I'd like to commit a proper fix to fsck, but that's a
little more involved and my day job is keeping me way too busy.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: problems with new the "contigmalloc" routine

2005-05-20 Thread Don Lewis
On 21 May, Peter Jeremy wrote:
> On Fri, 2005-May-20 21:51:34 +0200, Hans Petter Selasky wrote:

>>Can anyone explain why "uiomove()" has to sleep, and why there is no 
>>non-blocking "uiomove()"?
> 
> As far as I can see, uiomove() only sleeps if it is asked to do a
> kernel<->userland move that takes more than twice a scheduler quantum.
> As long as you don't uiomove() ridiculous amounts of data, it should
> never sleep.

It can also sleep if it stumbles across a userland page that isn't
resident.  When this happens, it will sleep until the page is retrieved
from swap.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Console ASCII interpretation

2005-05-16 Thread Don Lewis
On 16 May, alexander wrote:

> You're right. The code I'm using is wrong when it is being executed
> under the console. However the fact that Eterm and xterm do what I
> want to do with my app show that I'm not the only one who needs a
> NOP ascii value. Both render the NUL ascii code as NOP. Since both
> terms are much newer than the console this indicates that they
> reflect the recent changes in software development much better.

If you find pipe the output of your app through more while running in an
xterm window, or redirect the output to a file and view the file in your
favorite editor, you'll find another couple of cases where NUL isn't a
NOP.


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Re[2]: vn_fullpath()

2005-02-21 Thread Don Lewis
On 21 Feb, Robert Watson wrote:
> 
> On Mon, 21 Feb 2005, Igor Shmukler wrote:
> 
>> > So the first thing to do is to decied what your requirements are: are you
>> > willing to fail in the edge cases like the above?  If so, life is a lot
>> > easier :-). 
>> 
>> I guess I am willing to fail :). Perhaps in some distant future, we will
>> look into the nasty corner cases, but for now, as long as I get a name,
>> it will do. We don't even mind the hardlinks so much, but we cannot
>> afford to use existing vn_fullpath() because it does not guarantee
>> "anything".
> 
> There are a couple of issues to look at, if we can allow some obscure edge
> cases to fail, but want it to "generally" work:
> 
> (1) File systems that don't use the centralized name cache facility, such
> as procfs and devfs.
> 
> (2) What to do when useful paths fall out of the name cache.
> 
> I think the answer to (1) is to let those file systems simply provide a
> vnode operation to answer the question: they're almost always synthetic
> file systems, or they would be using the cache.  So I'm almost thinking: 
> 
> VOP_GETPATH(vp, char *buf)
> 
> The call would say to the file system "Tell me the path from your root to
> the vnode in question".
> 
> On the (2) front, I think there are a couple of possibilities -- the
> decision to let intermediate paths fall out of the name cache is an
> explicit design choice to reduce the vnode burden on the system.  We can
> either back off that design choice forcing intermediate nodes to generally
> remain in the cache, or we can accept it and address it.  My leaning is to
> add a new rule: "the last directory used to reach a file must not fall out
> of the cache if the file hasn't fallen out of the cache" -- with this in
> place, we can generate path names for most objects by walking back up the
> tree if elements are missing, either directly, or by asking the file
> system using the above call.  It's the last step from the file back to a
> parent directory that is the hardest.  Alternatively, we can back off
> dropping the intermediate nodes and see to what extent that hurts vs.
> helps.

I seem to recall that DragonFly keeps the intermediate nodes.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: MBUF statistics

2005-02-16 Thread Don Lewis
On 15 Feb, Max Laier wrote:
> On Tuesday 15 February 2005 12:38, Borja Marcos wrote:
>>  Hello,
>>
>>  Looking at the mbuf statistics available in FreeBSD 4 and FreeBSD 5 I
>> can see that the statistics available in FreeBSD 5 are, surprisingly,
>> much less comprehensive. Is there any other place where I can find out
>> how many mbuf requests have been done, how many of them have waited,
>> how many have failed, etc?
> 
> I use "$vmstat -z | grep Mbuf".  The netstat -m output is broken, because 
> fixing this would impose an additional atomic operation on each alloc/free 
> which is a real performance killer.

Why not maintain the statistics on a per-CPU basis, and sum up the
per-CPU statistics in the sysctl handler?  The handler might not get an
exact snapshot, but that shouldn't matter.

In the case of counters that are wider than 32 bits, it might be
necessary to break the counters up into chunks that can be incremented
atomically, and grab a lock when the least significant chunk overflows.


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: bug in calcru()

2005-01-31 Thread Don Lewis
On 26 Jan, Chris Landauer wrote:
> 
> hihi, doug -
> 
>> Doug Ambrisko <[EMAIL PROTECTED]> wrote
>>  ...
>>  The assumption with this calculation is that st & it tend to be
>>  small compared to tt so the 1024 X shouldn't overflow much.
>>  ...
>> [EMAIL PROTECTED] wrote:
>> |...but i'm a little worried that the 1024 multiplications aren't
>> |large enough when tt gets really large
>> | > Doug Ambrisko <[EMAIL PROTECTED]> wrote
>> | > ...
>> | > /* Subdivide tu. try to becareful of overflow */
>> | > su = tu * (st * 1024 / tt) / 1024;
>> | > iu = tu * (it * 1024 / tt) / 1024;
>> | > uu = tu - (su + iu);
>> | > ...
> 
> i'm not so worried about the overflow limit (that's what the mathematical
> analysis is intended to discover, and i assume that the bound is large enough
> to ignore the issue - that is the really clever part about computing su and iu
> first instead of uu), but the underflow - if st and it are small enough and tt
> is large enough, these equations produce 0 for both su and iu (and the
> reported percentage will rightly be 0.00%, but i want to see the rest of the
> detail for my time models)

It looks like the worst case for overflow would be if st == tt or it ==
tt, and even then overflow would not happen until the run time got up
above 500 years according to my calculations, so it would probably be
safe to increase the 1024 factor by quite a bit.  Even at 1024, the
system and interrupt time will be calculated to about 0.1%, though you
might want to tweak the formula a bit to do rounding.
su = tu * ((st * 1024 + 512) / tt) / 1024;

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: time and timing errors in c code on 5.x/i386 (longish)

2005-01-22 Thread Don Lewis
On 23 Jan, Peter Jeremy wrote:
> On Fri, 2005-Jan-21 14:49:41 -0800, Chris Landauer wrote:
>>i'm running some combinatorial search programs that take weeks or months to
>>complete, and no timer i've used is able to report correctly the user and
>>system time (they all make the same mistake - eventually the user time stops
>>incrementing) - i want precise times to do some predictive modeling
> 
> [evidence deleted]
> 
> The problem looks like an overflow error in calcru().  Have you seen any
> kernel messages beginning 'calcru:'?

I haven't on my 4.x machine.

> The offending code is:
> uu = (tu * ut) / tt;
> where all variables are uint64
>   uu is the user time in microseconds (that will be converted to a timeval
>  and reported via getrusage())
>   tu is the total usermode runtime allocated to your program (in usec)
>   ut is the number of usermode statclock hits (128Hz)
>   tt is the total (user+sys+int) statclock hits.
> 
>>user 378925.483628 syst 286.845375 elapse 381328.785295 pct 99.44%
>>user 379089.748458 syst 286.962284 elapse 381493.700660 pct 99.45%
>>user 379255.472355 syst 287.088004 elapse 381660.106387 pct 99.45%
>>user 379417.184286 syst 287.190223 elapse 381822.457863 pct 99.45%
>>user 379417.184286 syst 451.110470 elapse 381986.906692 pct 99.45%
>>user 379417.184286 syst 615.737725 elapse 382152.058304 pct 99.45%
> 
> At this point tu is roughly 379417184286 and ut is roughly 48565399
> The product is about 1.8e19 - which is roughly 2^64.
> 
> That particular code goes all the way back to BSD4.4lite so it's a bug
> that has always existed.  We can't use FP in the kernel and don't
> support 128-bit integers (or arithmetic) anywhere so a correct fix is
> quite ugly (and inefficient) in portable C.

I think this explains why setiathome thinks it has stopped accumulating
CPU time after a while.  I've mostly noticed this on my 4.x machine
because it runs 24x7 and tends to have long uptimes.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Patch for linux ABI for MSG_NOSIGNAL and out of order tcp packet issue

2005-01-18 Thread Don Lewis
On 18 Jan, David Malone wrote:
> On Tue, Jan 18, 2005 at 03:18:42PM -, Steven Hartland wrote:
>> The attached patch checks for
>> MSG_NOSIGNAL and if set enables SO_NOSIGPIPE
>> for the duration of send call.
> 
> I just had a quick look at the patch. The patch should probably
> use kern_setsockopt, which will simplify it considerably.
> (kern_setsockopt was introduced to FreeBSD 5 this summer to make
> it easier to do this sort of thing). It would probably also be
> better to do a kern_getsockopt first to find out if SO_NOPIPE is
> set and only turn it off afterwards if it wasn't already on.
> 
>> Im not 100% sure this is the
>> way to do it but have confirmed that the patch works on
>> 5.2.1 so if someone could check and commit it that would
>> be great.
> 
> I guess that it would be even better if we could just pass
> SO_NOPIPE to send, or even implement MSG_NOSIGNAL on FreeBSD,
> but your patch is probably a reasonably start.

That's probably the best solution.  We did the same thing to properly
implement non-blocking I/O on fifos.  Setting and clearing the socket
option for each syscall adds a lot of overhead, and there is also danger
that some other thread could be modifying the option at the same time.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: flushing disk buffer cache

2004-10-29 Thread Don Lewis
On 29 Oct, Siddharth Aggarwal wrote:
> 
> Thanks for your reply.
> 
> Hmm. At the moment, the user can send an ioctl to define a checkpoint. But
> I would guess that this could happen between 2 strategy() function calls
> corresponding to the same filesystem operation?

Yes.

> So if there a way to block
> filesystem operations while a snapshot is taken? I can't unmount an active
> filesystem before the snapshot and remount it after. Any suggestions?

Yes, this is done by the following code in ffs_snapshot():

/*
 * Suspend operation on filesystem.
 */
for (;;) {
vn_finished_write(wrtmp);
if ((error = vfs_write_suspend(vp->v_mount)) != 0) {
vn_start_write(NULL, &wrtmp, V_WAIT);
vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
goto out;
}
if (mp->mnt_kern_flag & MNTK_SUSPENDED)
break;
vn_start_write(NULL, &wrtmp, V_WAIT);
}

I think the snapshot code works at a higher level than what you are
implementing, so I believe that the snapshot code doesn't need to sync
all the files to create a consistent snapshot.  You may run into
problems with syncing unwritten data while writing is suspended, but I'm
not sure because don't understand the code all that well.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: flushing disk buffer cache

2004-10-29 Thread Don Lewis
On 29 Oct, Siddharth Aggarwal wrote:
> 
> Another related question ...
> 
> Is it possible to delay or queue up disk writes until I exit from my
> function in the kernel (where I am trying to sync with the disk)? Or
> make sure that my sync function never goes to sleep waiting for the disk
> driver to signal completion of flushes to disk?

Take a look at how the snapshot code handles this.  It has to be done
above the level of individual disk operations because certain file
system operations require multiple disk I/O operations to transform the
file system from one consistent state to another consistent state.  If
you try to checkpoint in the middle of this sequence, you will capture a
state where the file system is internally inconsistent.


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Printing from kernel

2004-10-06 Thread Don Lewis
On  7 Oct, Roman Kurakin wrote:
> Hi,
> 
>   I have some problems with printing from kernel.
> At first I think that my problems was cause I use printf,
> but changed all of them to log cause it safe to use from
> interrupt handlers. The situation become better but I still
> observe system lockup in case I output some debug information
> from my driver.
> 
>   Also I have some problems with system console via com
> port. Instead of messages from kernel I see the first letter
> of the month name.

This is a bug in syslogd related to non-blocking I/O that bde and I
discussed quite a while back, though we never figured out a proper fix.
I recently made the interesting discovery that the same problem isn't
present on sparc64.

I think it'll start working again if you restart syslogd.

>   Could anybody comment my observation? Does anybody
> saw anything like this?
> 
>   Oh, I forget to say I observe that with both Current
> and Releng5, SMP. Also I can't trigger NMI so I can't see the
> point of lockup.

I generally use printf for this sort of thing, and I was going to
suggest that you take a look at the KTR stuff, but that won't help if
the machine totally locks up so that you can't get to the KTR buffer.

I think you'll have trouble getting close to the bug if you use log
because of the log latency from the generation of the message, passing
it through syslogd, and back to the kernel to be printed.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: FreeBSD Kernel buffer overflow

2004-09-20 Thread Don Lewis
On 20 Sep, [EMAIL PROTECTED] wrote:

>> cat kern_syscalls.diff
> --- kern_syscalls.c Sat Sep 18 13:42:21 2004
> +++ kern_syscalls2.cMon Sep 20 14:18:45 2004
> @@ -58,6 +58,16 @@
>  syscall_register(int *offset, struct sysent *new_sysent,
>  struct sysent *old_sysent)
>  {
> +#ifndef __ia64__
> +   if (new_sysent->sy_narg < 0 || new_sysent->sy_narg > MAX_SYSCALL_ARGS)
> +   {
> +   printf("Invalid sy_narg for syscall: boundary is [0 - %d]\n",
> +   MAX_SYSCALL_ARGS);
> +   return EINVAL;
> +   }
> +#endif
> +
> +

It would probably be better to change the #ifndef to
#ifdef MAX_SYSCALL_ARGS

I would also add new_sysent->sy_narg to the printf().

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: FreeBSD Kernel buffer overflow

2004-09-19 Thread Don Lewis
On 19 Sep, [EMAIL PROTECTED] wrote:
> 
>>Don,
>>
>>This sounds excellent.  Can an src-committer verify that the following
> is
>>ok and commit it along with the manpage diff I posted earlier to HEAD?
>>
>>The hard-wired number 8 in there seems like something that could probably
>>be improved a lot, but after looking for a short while I couldn't find
> a
>>good way of finding out from the arguments of syscall_register() some way
>>to calculate it.  Of course, I'm far from an experienced kernel hacker
> and
>>I'm probably missing something.  Feel free to correct the following diff
>>or
>>even replace it entirely.
> 
> Maybe you can get a look at this approach:
> 
> ==
> 
> $arch/include/md_var.h:
> 
>> cat md_var.diff
> --- md_var2.h   Sun Sep 19 22:43:56 2004
> +++ md_var.hSun Sep 19 22:46:23 2004
> @@ -41,6 +41,12 @@
>  extern int (*copyin_vector)(const void *udaddr, void *kaddr, size_t
> len);
>  extern int (*copyout_vector)(const void *kaddr, void *udaddr, size_t
> len);
> 
> +/*
> + * Arguments number syscalls definition
> + */
> +
> +#define MAGIC_SYSCALL_ARGS 8
> +
>  extern longMaxmem;
>  extern u_int   basemem;/* PA of original top of base memory */
>  extern int busdma_swi_pending;

 which is installed from
src/sys/{alpha,amd64,i386,ia64,etc}/param.h would be a more appropriate
location.  There may be cases where you would want to know this value in
userland, in which case including  would definitely
not be appropriate.

My preference would be to name it MAX_SYSCALL_ARGS.


> 
> 
> kern/kern_syscall.c:
>> cat kern_syscall.diff
> --- kern_syscalls.c Sat Sep 18 13:42:21 2004
> +++ kern_syscalls2.cSun Sep 19 23:00:44 2004
> @@ -27,6 +27,8 @@
>  #include 
>  __FBSDID("$FreeBSD: src/sys/kern/kern_syscalls.c,v 1.11 2004/07/15 08:26:05
> phk Exp $");
> 
> +#include 
> +
>  #include 

 includes , so if the #define is added to
 you won't have to include  here.

The rest of the changes look ok, though you might want to add a printf()
before "return EINVAL" so that the reason for failure gets logged.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: kernel buff overflow

2004-09-19 Thread Don Lewis
On 19 Sep, Giorgos Keramidas wrote:
> On 2004-09-19 15:04, [EMAIL PROTECTED] wrote:
>> --- kern_syscalls.c Sat Sep 18 13:42:21 2004
>> +++ kern_syscalls2.cSun Sep 19 14:59:27 2004
>> @@ -58,6 +58,12 @@
>>  syscall_register(int *offset, struct sysent *new_sysent,
>>  struct sysent *old_sysent)
>>  {
>> +
>> +#ifdef __i386__
>> +if (new_sysent->sy_narg < 0 || new_sysent->sy_narg > i386_SYS_ARGS)
>> +return E2BIG;
>> +#endif
>> +
>> if (*offset == NO_SYSCALL) {
>> int i;
> 
> If a very simple but similar check can be added that works for all the
> architectures it's probably a cleaner solution, i.e.:
> 
> : #ifndef SYSCALL_MAX_ARGS
> : #define SYSCALL_MAX_ARGS8
> : #endif
> :
> : if (new_sysent->sy_narg < 0 || new_sysent->sy_narg > SYSCALL_MAX_ARGS)
> : return EINVAL;
> 
> Then each architecture can define SYSCALL_MAX_ARGS at compile time.

Yes, the value should be defined in the architecture-specific
.  Also the machine specific syscall handlers in trap.c
should be modified to use the defined parameter instead of just using
the architecture-specific magic number.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: FreeBSD Kernel buffer overflow

2004-09-18 Thread Don Lewis
On 18 Sep, [EMAIL PROTECTED] wrote:
> Here i report a patch different from Giorgos' one. The approch is completely
> different: working on syscall_register() function in kern/kern_syscalls.c
> file.
> 
> ==
> 
>> cat kern_syscalls.diff
> --- kern_syscalls.c Sat Sep 18 14:37:53 2004
> +++ kern_syscalls2.cSat Sep 18 14:37:53 2004
> @@ -73,6 +73,11 @@
> sysent[*offset].sy_call != (sy_call_t *)lkmressys)
> return EEXIST;
> 
> +#if (__i386__) && (INVARIANTS)
> +   KASSERT(new_sysent->nargs >= 0 && new_sysent->nargs <= i386_SYS_ARGS,
> +   "invalid number of syscalls");
> +#endif
> +
> *old_sysent = sysent[*offset];
> sysent[*offset] = *new_sysent;
> return 0;

Why panic the machine at this point?  Just refuse to install the syscall
and return an error.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: FreeBSD Kernel buffer overflow

2004-09-18 Thread Don Lewis
On 18 Sep, Pawel Jakub Dawidek wrote:
> On Fri, Sep 17, 2004 at 12:37:12PM +0300, Giorgos Keramidas wrote:
> +> % +#ifdef INVARIANTS
> +> % +   KASSERT(0 <= narg && narg <= 8, ("invalid number of syscall args"));
> +> % +#endif
> 
> Maybe:
> KASSERT(0 <= narg && narg <= sizeof(args) / sizeof(args[0]),
> ("invalid number of syscall args"));
> 
> So if we decide to increase/decrease it someday, we don't have to remember
> about this KASSERT().

What keeps the attacker from installing two syscalls, the first of which
pokes NOPs over the KASSERT code, and the second of which accepts too
many arguments?

If you think we really need this bit of extra security, why not just
prevent the syscall with too many arguments from being registered by
syscall_register()?  At least that keeps the check out of the most
frequently executed path.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: FreeBSD Kernel buffer overflow

2004-09-17 Thread Don Lewis
On 18 Sep, Matt Emmerton wrote:
> 
> - Original Message - 
> From: "Mike Meyer" <[EMAIL PROTECTED]>
> To: "Matt Emmerton" <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>; "Avleen Vig"
> <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>;
> <[EMAIL PROTECTED]>
> Sent: Saturday, September 18, 2004 1:22 AM
> Subject: Re: FreeBSD Kernel buffer overflow
> 
> 
>> In <[EMAIL PROTECTED]>, Matt Emmerton
> <[EMAIL PROTECTED]> typed:
>> > I disagree.  It really comes down to how secure you want FreeBSD to be,
> and
>> > the attitude of "we don't need to protect against this case because
> anyone
>> > who does this is asking for trouble anyway" is one of the main reason
> why
>> > security holes exist in products today.  (Someone else had brought this
> up
>> > much earlier on in the thread.)
>>
>> You haven't been paying close enough attention to the discussion. To
>> exploit this "security problem" you have to be root. If it's an
>> external attacker, you're already owned.
> 
> I'm well aware of that fact.  That's still not a reason to protect against
> the problem.
> 
> If your leaky bucket has 10 holes in it, would you at least try and plug
> some of them?

If an attacker is allowed to install arbitrary syscalls, he might as
well install one that is easier to exploit.

struct write2kernel_args {
void*ubuf;
void*kbuf;
size_t  nbyte;
};
void
write2kernel(td, uap)
struct thread *td;
struct write2kernel_args *uap;
{
 
copyin(uap->ubuf, uap->kbuf, nbyte);
}

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: amd/autofs on BSD (was Re: waiting on sbwait)

2004-06-25 Thread Don Lewis
On 25 Jun, Brandon D. Valentine wrote:
> On Fri, Jun 25, 2004 at 02:43:03PM +0300, Danny Braniss wrote:
>> I understand, but the problem is that all access via amd are now stalled, till
>> the one process failes/times-out. I guess it's because the single thread amd.
> 
> This is an excellent opportunity to ask:
> 
> Is anyone working on or does anyone know of a project to develop a real
> autofs implementation for BSD that's compatible with the autofs
> implementations on every other UNIX?  I see where am-utils got some
> 'gamma quality' autofs support added for Linux and Solaris systems, but
> still nada on BSD.  IMO amd has probably outlived its usefulness and
> should be shot in the head and replaced with a Sun-style automounter.
> It does much to discourage BSDs adoption in large multiuser environments
> when it requires special cruft to make it play nicely with automounting.

In two different large scale automounter deployments, I've used amd
rather than the old Sun automounter or autofs because of the extra
flexibility in amd.  For instance, I can use different mount options
(rsize, wsize, timeouts, etc.) when mounting file systems between sites
versus mounting file systems local to the LAN while only having to
maintain one set of maps.  I've also found good uses for type:=link
maps.  It's also useful to be able to distribute maps via hesiod (DNS),
especially when the enterprise has multiple NIS domains.

That said, I would dearly like to avoid the need for a separate mount
point for the automounted file systems.  It's ugly when this wierd path
shows up in the outout of pwd.  It is even worse when software caches
this path and tries to return there after the file system has been
unmounted and fails because it is bypassing the automounter.  Autofs is
much nicer in that respect.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: waiting on sbwait

2004-06-25 Thread Don Lewis
On 24 Jun, Danny Braniss wrote:

> found the cause: NFS/amd
> a user had several symlinks to /net/host/xyz, and host was down.
> doing ls -F /net/host/xyz does the trick, the machine becomes
> unresponsive.

/net is evil.  A fun trick is to attempt to access
/net/nonexistenthost and watch amd wedge while it gropes around the DNS
tree looking for nonexistenthost.  At a previous job I stumbled across
this when I noticed that amd would hang whenever we lost our internet
connection.  Watching DNS queries sent to the Internet revealed all
sorts of interesting things that could be considered to be a leak of
sensitive information.

I also don't like NFS exporting /, and accessing /net/machine doesn't
work too well if / isn't exported.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: regarding signals...

2004-06-22 Thread Don Lewis
On 22 Jun, pradeep reddy punnam wrote:
> Hi,
>  
> i am modifing my ../netinet/ip_input.c code so that kernel can inform a 
> user process about the arrival of a packet, i want to use signaling 
> mechanism for this , i know the pid of the process to which the signal 
> should be send, i am looking for exact function that can help me in 
> sending SIGIO to procss...
> i tryed to use the kill and psignal functions but the system going  
> panic when the packet arrives...may be my use of the fuctions is wrong...
> can i call a system call from the kernel
> somebody tell me what functions are suitable to call for such a 
> situation
> thanking you...

Take a look at how the various FIOSETOWN ioctl() handlers are written.
The pid or process group id is passed to ioctl(), and the kernel passes
this to fsetown(), which does a lookup on the pid (or pgrp id) and
stores a pointer to the process or process group in a struct sigio, and
a pointer to this structures is stored in the location specified as the
second argument to fsetown(). When the file descriptor that was passed
to ioctl() is closed, funsetown() gets called.

When an event that should trigger the SIGIO is detected, pgsigio()
should be called with a pointer to a pointer to the appropriate struct
sigio.

One of the things that gets handled automagically is that when the
process or process group that is supposed to receive the SIGIO exits,
the SIGIO handling is disabled so that some other process that inherits
the same pid at a later time doesn't start receiving unexpected signals.

Instead of writing some custom kernel code for this, why don't you just
use bpf, which already implements the FIOSETOWN ioctl() call?



___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: HEADS UP! KSE needs more attention

2004-06-06 Thread Don Lewis
On  6 Jun, Daniel Eischen wrote:
> On Sun, 6 Jun 2004, Marcel Moolenaar wrote:
> 
>> On Sun, Jun 06, 2004 at 02:31:56PM -0600, Scott Long wrote:
>> 
>> > As with Alpha,
>> > the fate of a platform rests on the people who are willing to work on
>> > it, not on whether it is in a particular list.
>> 
>> Agreed, but it's the projects responsibility to take the tierness and
>> the intend to support multiple platforms serious and not to chicken out
>> at the first signs of complications or hurdles. We labeled sparc64 as
>> a tier 1 platform and we better deal with the consequences.
> 
> Not to take away from the tremendous effort that jake had done for
> sparc64, but it should really take more than one or two supporting
> developers to obtain tier 1 support.  People come and go, and
> tierness should take that into account.

I've got some sparc64 hardware that recently became available for
FreeBSD develpment.  Unfortunately my time available to FreeBSD is
likely to be the limiting factor.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: How do inodes work?

2004-05-16 Thread Don Lewis
On 16 May, David Malone wrote:
> On Sun, May 16, 2004 at 02:25:37AM -0300, Marc G. Fournier wrote:
>> so I take there are 'gaps' in the inode list?  it doesn't re-use freed
>> ones but keeps climbing until maybe it rolls around or something?
> 
> A particular numbered inode always lives in the same place on the
> disk. When choosing what inode to use for a new file, the filesystem
> tries to pick a inode to put the file close to the directory it is
> being created in. This is the dirpref optimisation introduced a few
> years ago - previously inodes were chosen from a part of a disk that
> had the most nearby free space.

The preferred location of inodes for regular files has always been in
the same cylinder group as their parent directory.  The dirpref
optimization changed the policy for selecting the cylinder group when
new directories are created.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: non-recursive mutex, sbc, pcm, kernel panic

2004-05-09 Thread Don Lewis
On  9 May, Bartek Marcinkiewicz wrote:
> Hello,
> 
> I've just experienced kernel panic while trying
> to play mp3 file. (My sound card: Creative Sound
> Blaster 16, ISA, worked fine on older 5.x system).
> 
> While reading code:
> 
> sys/dev/sound/pcm/sound.c::snd_mtxcreate() creates 
> non-recursive mutex.
> 
> But in sys/dev/sound/isa/sb16.c::sb_setup() we have:
> 
> sb_setup()
> {
> 
>   sb_lock(sb);
> 
>   /* ... */
> 
>   sb_reset_dsp(sb);
> 
>   /* ... */
> }
> 
> sb_reset_dsp() function locks this mutex again, causing 
> panic with message: 
> _mtx_lock_sleep: recursed on non-recursive mutex sbc0 @...
> 
> Is this known issue? Why this mutex should be non-recursive?
> Attached patch makes it work again.

This isn't a known issue.  It looks like you are the first person to
exercise the sb16 driver in FreeBSD-5.x in quite some time.  Allowing
recursive locks makes it much more difficult to get the locking correct
since you lose the ability to use assertions about the lock state at
function entry and exit.

Typically the proper fix in this case would be to remove the
sb_lock() call from sb_reset_dsp() and always let the caller do the
locking.  The patch below should do the trick, though I believe the
added locking and unlocking (the second section of the patch) could be
omitted in sb16_attach() since no other thread can access the device
while the attach is in progress.

Index: sys/dev/sound/isa/sb16.c
===
RCS file: /home/ncvs/src/sys/dev/sound/isa/sb16.c,v
retrieving revision 1.83
diff -u -r1.83 sb16.c
--- sys/dev/sound/isa/sb16.c14 Apr 2004 14:57:48 -  1.83
+++ sys/dev/sound/isa/sb16.c9 May 2004 19:33:09 -
@@ -266,12 +266,10 @@
 {
u_char b;
 
-   sb_lock(sb);
sb_wr(sb, SBDSP_RST, 3);
DELAY(100);
sb_wr(sb, SBDSP_RST, 0);
b = sb_get_byte(sb);
-   sb_unlock(sb);
if (b != 0xAA) {
DEB(printf("sb_reset_dsp 0x%lx failed\n",
   rman_get_start(sb->io_base)));
@@ -799,8 +797,12 @@
 
if (sb16_alloc_resources(sb, dev))
goto no;
-   if (sb_reset_dsp(sb))
+   sb_lock(sb);
+   if (sb_reset_dsp(sb)) {
+   sb_unlock(sb);
goto no;
+   }
+   sb_unlock(sb);
if (mixer_init(dev, &sb16mix_mixer_class, sb))
goto no;
if (snd_setup_intr(dev, sb->irq, 0, sb_intr, sb, &sb->ih))

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: 5.2.1-RC hangs occasionally when sound files are played using the pcm driver

2004-02-23 Thread Don Lewis
On 23 Feb, Brian O'Shea wrote:
> --- Brian O'Shea <[EMAIL PROTECTED]> wrote:
>> --- Mathew Kanner <[EMAIL PROTECTED]> wrote:
>> > 
>> >Hello Brian,
>> >Don Lewis commited changes to the 5.2- tree on 2/14.  Could
>> > you update and try again, also please do run with witness and
>> > invariants, and if possible try to get a crashdump so we can see
>> > what's happening.
>> 
>> Sure.  I'll try to reproduce it with witness and invariants first, just
>> to avoid changing too many conditions at the same time.  Failing that,
>> I'll update to the latest 5.2 tree and see if the problem goes away.
> 
> makeoptions DEBUG=-g#Build kernel with gdb(1) debug symbols
> options INVARIANTS  #Enable calls of extra sanity checking
> options INVARIANT_SUPPORT   #Extra sanity checks of internal
> structures, required by INVARIANTS
> options WITNESS #Enable checks to detect deadlocks and
> cycles
> #optionsWITNESS_SKIPSPIN#Don't run witness on spinlocks for
> speed
> 
> Ok, the problem seems to happen much more reliably now, but still no
> panic (so no crash dump).  However, I did notice that for some reason
> I had built an SMP kernel.  All kernels that I have seen exhibit this pcm
> driver hang were apparently SMP kernels...I guess that's the default
> in GENERIC, which I copied and adapted for my kernel config.  As my system
> only has one CPU, I will build a UP kernel and see if that has any
> effect.

Witness won't trigger a panic.  It just prints some information to the
console about the error.

SMP is now the default and runs fine on UP boxes.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: 5.2.1-RC hangs occasionally when sound files are played using the pcm driver

2004-02-23 Thread Don Lewis
On 23 Feb, Brian O'Shea wrote:
> --- Don Lewis <[EMAIL PROTECTED]> wrote:
>> 
>> The cause of deadlocks is more likely to be caught by WITNESS.  In this
> 
> With WITNESS the hang still occurrs, and still no panic.  It's hard to
> tell since the problem tends to happen at random times, but it seems like
> it happens more quickly with the kernel that has WITNESS and INVARIANTS
> enabled.

Can you see any kernel diagnostic messages, or are you running X?  If
you are running X, can you set up a serial console, or can you reproduce
the hang while switched to a text console?

Depending on the cause of the problem, you might need to build a kernel
with DDB support and break into DDB from the console while the machine
is hung to figure out the cause.

>> case it might be the result of a malloc() call while a mutex is held.
>> Even the version of the sound code in the most recent -CURRENT has some
>> problems in this area.  I've got a patch out for testing that will this
>> problem to some extent, but it might not be enough.
>> 
> 
> I'd be willing to test your patch, if that would help.  Where can I
> find it?

The patch below should be applied to a recent version of -CURRENT.

Index: sys/dev/sound/pcm/dsp.c
===
RCS file: /home/ncvs/src/sys/dev/sound/pcm/dsp.c,v
retrieving revision 1.73
diff -u -r1.73 dsp.c
--- sys/dev/sound/pcm/dsp.c 21 Feb 2004 21:10:47 -  1.73
+++ sys/dev/sound/pcm/dsp.c 22 Feb 2004 23:11:43 -
@@ -444,7 +444,7 @@
 static int
 dsp_ioctl(dev_t i_dev, u_long cmd, caddr_t arg, int mode, struct thread *td)
 {
-   struct pcm_channel *wrch, *rdch;
+   struct pcm_channel *chn, *rdch, *wrch;
struct snddev_info *d;
intrmask_t s;
int kill;
@@ -477,22 +477,19 @@
if (kill & 2)
rdch = NULL;

-   if (rdch != NULL)
-   CHN_LOCK(rdch);
-   if (wrch != NULL)
-   CHN_LOCK(wrch);
-
switch(cmd) {
 #ifdef OLDPCM_IOCTL
/*
 * we start with the new ioctl interface.
 */
case AIONWRITE: /* how many bytes can write ? */
+   CHN_LOCK(wrch);
 /*
if (wrch && wrch->bufhard.dl)
while (chn_wrfeed(wrch) == 0);
 */
*arg_i = wrch? sndbuf_getfree(wrch->bufsoft) : 0;
+   CHN_UNLOCK(wrch);
break;
 
case AIOSSIZE: /* set the current blocksize */
@@ -502,12 +499,16 @@
p->play_size = 0;
p->rec_size = 0;
if (wrch) {
+   CHN_LOCK(wrch);
chn_setblocksize(wrch, 2, p->play_size);
p->play_size = sndbuf_getblksz(wrch->bufsoft);
+   CHN_UNLOCK(wrch);
}
if (rdch) {
+   CHN_LOCK(rdch);
chn_setblocksize(rdch, 2, p->rec_size);
p->rec_size = sndbuf_getblksz(rdch->bufsoft);
+   CHN_UNLOCK(rdch);
}
}
break;
@@ -515,37 +516,51 @@
{
struct snd_size *p = (struct snd_size *)arg;
 
-   if (wrch)
+   if (wrch) {
+   CHN_LOCK(wrch);
p->play_size = sndbuf_getblksz(wrch->bufsoft);
-   if (rdch)
+   CHN_UNLOCK(wrch);
+   }
+   if (rdch) {
+   CHN_LOCK(rdch);
p->rec_size = sndbuf_getblksz(rdch->bufsoft);
+   CHN_UNLOCK(rdch);
+   }
}
break;
 
case AIOSFMT:
+   case AIOGFMT:
{
snd_chan_param *p = (snd_chan_param *)arg;
 
if (wrch) {
-   chn_setformat(wrch, p->play_format);
-   chn_setspeed(wrch, p->play_rate);
+   CHN_LOCK(wrch);
+   if (cmd == AIOSFMT) {
+   chn_setformat(wrch, p->play_format);
+   chn_setspeed(wrch, p->play_rate);
+   }
+   p->play_rate = wrch->speed;
+   p->play_format = wrch->format;
+   CHN_UNLOCK(wrch);
+   } else {
+   p->play_rate = 0;
+  

Re: 5.2.1-RC hangs occasionally when sound files are played using the pcm driver

2004-02-23 Thread Don Lewis
On 23 Feb, Mathew Kanner wrote:
> On Feb 23, Brian O'Shea wrote:
>> --- Mathew Kanner <[EMAIL PROTECTED]> wrote:
>> > 
>> >Hello Brian,
>> >Don Lewis commited changes to the 5.2- tree on 2/14.  Could
>> > you update and try again, also please do run with witness and
>> > invariants, and if possible try to get a crashdump so we can see
>> > what's happening.
>> 
>> Sure.  I'll try to reproduce it with witness and invariants first, just
>> to avoid changing too many conditions at the same time.  Failing that,
>> I'll update to the latest 5.2 tree and see if the problem goes away.
> 
>   I think it would be better to go to 5.2.1 and start testing
> from there.  I believe that one of the things that Don improved was a
> condition where sound could randomnly stomp other areas of the kernel.

The memory corruption problem only happens if vchans are enabled and
typically causes wierd system panics instead of hangs.  Also, the commit
that I just did on the RELENG_5_2 branch only changes a KASSERT() to a
panic() so that the cause of the crash is obvious, since INVARIANTS is
disabled by default on this branch, which disables KASSERT().

The cause of deadlocks is more likely to be caught by WITNESS.  In this
case it might be the result of a malloc() call while a mutex is held.
Even the version of the sound code in the most recent -CURRENT has some
problems in this area.  I've got a patch out for testing that will this
problem to some extent, but it might not be enough.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Zombie processes not clearing (solution)

2004-02-10 Thread Don Lewis
On 10 Feb, Steven Hartland wrote:
> Ok looks like this has been spotted before and its a kernel issue
> described here:
> http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=bausg0%2416c5%241%40FreeBSD.csie.NCTU.edu.tw&rnum=13

If the kernel patch mentioned above fixes the problem, I'll commit it to
-CURRENT.  I never got any feedback when I previously posted it.

> I've made a 5.1 and 5.2 patch available from:
> ftp://ftp.multiplay.co.uk/pub/games/fps/battlefield1942/patches/FreeBSD/

I haven't been able to access that FTP server.  It always tells me that
the maximum number of allowed clients is connected.

> This affects all linux threaded apps which don't set SIGCHLD or
> do an explicit wait().

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: how to fool gcc?

2004-02-10 Thread Don Lewis
On 10 Feb, Dag-Erling Smørgrav wrote:

If you don't minde a bit of bloat, maybe changing this:

>  openpam_log(PAM_LOG_DEBUG, "returning '%s'", (s)); \

to this:

   openpam_log(PAM_LOG_DEBUG, "returning '%s'", (s) != NULL ? (s) : "");

might quiet the warning.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: send(2) does not block, send(2) man page wrong?

2004-01-23 Thread Don Lewis
On 23 Jan, Stuart Pook wrote:
>> send() for UDP should block if the socket is filled and the interface
>> can't drain the data fast enough.
> 
> It doesn't (at least I cannot make it block)
> 
>> Good question.  There is not feedback loop like in tcp, so handling this
>> blocking and releasing would be a little bit harder to do for UDP.
> 
> Send(2) indicates that it should do so.
> 
>> > I have written a test program,
>> > http://www.infres.enst.fr/~pook/send/server.c, that shows that send does
>> > not block on FreeBSD.  It does with Linux and Solaris.
>> 
>> Do you know what the behaviour of Net- and/or OpenBSD is?
> 
> NetBSD is the same as FreeBSD.  I have not tested OpenBSD.
> MacOS X is similiar to FreeBSD in that send doesn't block, howver
> the send does not give an error: the packet is just thrown away.

Which is the same result as you would get if the bottleneck is just one
network hop away instead of at the local NIC.

Even if you changed the network stack to block or return an error when
it detected that it was tossing packets away, the application has no way
of knowing that all, a majority of, or even any of its data was getting
though even though it wasn't blocked by send() and didn't receive any
error returns.  Think about the case of a gigabit LAN connected to the
Internet over a modem link.  Even with a stack that blocked send() so
that no packets were lost in the stack, the application would think it
was sending data to a peer on the Internet at gigabit speeds, but in
reality most of the traffic would be silently dropped.  Even within the
LAN, traffic could be dropped if the outgoing switch port was more
congested than the link from the sending host to the NIC.

If you want to send a lot of data as fast as possible using UDP, then
you'll probably need to reinvent the TCP congestion avoidance algorithms
in your application so that you don't overly impact the network.  The
application can't rely on send() blocking or returning errors, since you
don't know that the local network interface is the bottleneck.  Since
the bottleneck could be anywhere, the application code is simpler if it
relies on cues that are the same no matter where the bottleneck is
located rather than adding extra code just to handle a local bottleneck.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Where is FreeBSD going?

2004-01-06 Thread Don Lewis
On  5 Jan, Brett Glass wrote:

> It's probably one of the Slashdot "BSD is dead" trolls. The fact is, though,
> that there ARE things about FreeBSD that could stand improvement. These
> days, when I build a box, I am torn between using FreeBSD 5.x -- which is
> not ready for prime time but is at least being worked on actively -- and
> using 4.9, which isn't as stable as it should be because the developers
> broke the cardinal rule of making radical changes to -STABLE. This *is*
> a real issue for those of us who are admins.

The worst breakage of 4-STABLE in recent memory was the PAE commit,
which I got the impression was driven by end-user demand.  Probably
folks who had expensive systems with > 4GB of RAM who wanted to be able
to run 4-STABLE production systems and make use of all that RAM right
now and not wait for 5.x to become production-worthy.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Power consumption in desktop computers

2004-01-01 Thread Don Lewis
A few more data points:

System 1 (FreeBSD 4-STABLE)
  Pentium II 400
  Asus P2B-LS motherboard (fxp + aic7890 Ultra2 SCSI on board)
  384 MB ECC RAM
  Matrox G20
  Floppy
  Seagate ST336737LW
  Seagate ST39173LW
  Plextor PX-R412C CD-R
  Tandberg SLR-5
  Floppy
  Supermicro Case with lots of fans
  Idle:   72W / 112VA
  Active: 88W / 134VA


System 2 (FreeBSD 4-STABLE)
  Aopen AX34-U motherboard
  1 GHz Celeron underclocked to 668 MHz (66 MHz FSB instead of rated 100
MHz)
  256 MB ECC RAM
  cheap PCI video card
  IBM IC25N010ATDA04 10 GB laptop drive
  1x sis ethernet
  1x fxp ethernet
  no floppy
  90 W (?) power supply
  Active(?): 39 W / 57 VA


System 3 (FreeBSD 5-CURRENT)
  Gigabyte GA7-DX+ motherboard
  Athlon XP 1900+
  1 GB ECC RAM
  Adaptec 19160B SCSI controller
  Seagate ST336706LW
  ancient NEC CD-ROM drive
  fxp ethernet
  g-force 2mx video
  floppy
  Antec True Power 330W supply
  Idle: 135 W / 198 VA
  Active:   153 W / 223 VA


The power measurements were made with a Radio Shack "Kill A Watt" meter.

System #3 is my second fastest machine.  My 1.5 GHz Pentium-M laptop is
slightly faster on the "make buildworld" benchmark.  I haven't gotten
around to measuring its power consumption yet, but it runs a whole lot
cooler than the Athlon and its power supply is rated at 72 W.

The newer Athlon processors would be interesting, except that I haven't
found a motherboard that supports them with ECC RAM.  About the only
Athlon motherboards that support ECC RAM use the AMD 761 chipset, which
is getting rather dated.

Has anyone done power consumption measurements on amd64 systems?

There are some Pentium-M motherboards available now that support ECC
RAM.  They are targeted at the embedded market and are fairly pricey,
but I would expect them to perform well and have low power consumption.



___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Update: PR bin/60636

2004-01-01 Thread Don Lewis
On 29 Dec, Jason Slagle wrote:
> 
> So I'm not the only one?
> 
> After submitting a single PR, I started getting innudated with klez
> emails.  12+ a day, to an account that had never received one.
> 
> Me thinks it's time to obscure the emails somehow.

That might not help.  I would think that it is more likely that an
infected host is subscribed to the freebsd-bugs@ list and is harvesting
email addresses from the From: header instead of something doing
periodic sweeps of the PR database.  I'd be even more suspicious that
this is the case if the spewage started very shortly after the PR was
filed.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: New complaints

2003-11-14 Thread Don Lewis
On 14 Nov, M. Warner Losh wrote:
> checking stopevent 2 with the following non-sleepable locks held:
> exclusive sleep mutex sigacts r = 0 (0xc48c5aa8) locked @ kern/subr_trap.c:260
> checking stopevent 2 with the following non-sleepable locks held:
> exclusive sleep mutex sigacts r = 0 (0xc48c5aa8) locked @ kern/subr_trap.c:260
> 
> Lots and lots of these (thousands) when I login.  pwd causes this.
> This is as of 0230 UTC Nov 15, 2003.  Yesterday's kernel (2300 UTC Nov
> 13, 2003), on a different machine, doesn't seem to have this problem.
> 
> Is anybody else seeing this?

There have been an number of reports of this earlier today.  I just
upgraded and got bit by it as, too.  The culprit is sys/sys/proc.h
1.359.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Modem won't connect at full speed

2003-08-20 Thread Don Lewis
On 20 Aug, Lars Eighner wrote:
> On Tue, 19 Aug 2003, Don Lewis wrote:

> I set the port setup in minicom to 57600, and got this
> on connect:
> 
> 
> atdt4857440
> CONNECT 29333/ARQ/V90/LAPM/V42BIS
> 
> 
> and when I hung up the current settings looked like this:
> 
> ati4
> U.S. Robotics 56K Voice INT Settings...
> 
>B0  E1  F1  L1  M0  Q0  V1  X4  Y0
>BAUD=57600  PARITY=N  WORDLEN=8
>DIAL=TONEON HOOK   CID=0

> changed to 115200 in mincom port setup and got:
> 
> atdt4857440
> CONNECT 34666/ARQ/V90/LAPM/V42BIS

> With 230400, it didn't change:
> 
> atdt4857440
> CONNECT 34666/ARQ/V90/LAPM/V42BIS
> 
> Forced &B=0 and got
> 
> AT&B0
> OK
> atdt4857440
> CONNECT 36000/ARQ/V90/LAPM/V42BIS
> 
> which I guess is some improvement, but doesn't account for the
> huge discrepancies in actual download times/rates
> I am getting.

The speed reported with the CONNECT message is the modem to modem speed,
which depends on the how well the modems are able to cope with the
conditions of the phone connection.  It is independent of the DTE speed
that you use to connect to the modem.  It seems to vary somewhat from
call to call in this case, and it is fast enough that it shouldn't be
what is limiting your speed.

The DTE speed reported by ATI4 seems to match the speed that you are
using to connect to the modem, so if you have the speed in ppp.conf
cranked up, it should not be the limiting factor.

> I saved to NVRAM and exited. I adjusted the connect speed in ppp.conf
> to 115200. I rebooted and when ppp came up, no joy.  Ppp.log
> looked much the same, but it doesn't log the whole CONNECT line.
> 
> I really think this is a software thing, since all went well when
> I had a linux partition and did ppp from linux.

I'm inclined to agree.

> As I said, someone solved this for me before, but my records of it
> got lost in deleting the linux partion and moving freebsd and my
> data to the big disk.   A frantic search of wetware dimly suggests
> that the previous solution involved adding a couple of lines to
> a file one of which included "nobsdcomp" but that file wasn't
> in /etc/ppp, but was someplace really weird - maybe something to
> do with kernel mod loading.  That solution worked in 4.x-STABLE
> but I am now running 5.0-x RELEASE.
> 

The bsdcomp option is documented in the pppd man page.  The nobsdcomp
option disables it.  It is probably not supported by your ISP, but the
ppp startup negotiations should take care of this.  Other things that
might cause problems are the other flavors of compression, Van Jacobson
style TCP/IP header compression, and asyncmap.

Your comment about kernel modules sounds like the netgraph ppp module.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Modem won't connect at full speed

2003-08-19 Thread Don Lewis
On 19 Aug, Lars Eighner wrote:
> On Tue, 19 Aug 2003, Don Lewis wrote:
> 
>> On 19 Aug, Lars Eighner wrote:
>> > This has been answered by somebody on some forum, but I lost it.
> 
>> If you manually dial using cu or tip, what connection speed does the
>> modem report?
> 
> Oh geez, pull out the manual and spend all night trying to figure
> out how to config cu or tip, whatever they are!

You should be able to do the same with minicom.  While on-hook, just
do

ATDT your_isps_phone_number

>> >BAUD=9600  PARITY=N  WORDLEN=8
>> >DIAL=TONEON HOOK   CID=0
> 
>> It is somewhat worrysome that your modem is reporting 9600 BAUD in the
>> fixed DTE speed setting.  I don't know about USR Internal modems, but at
>> least some implementations will pace the data flow rate to the reported
>> DTE speed to avoid overwhelming the host with quick bursts of
>> interrupts.  This might be the reason for your slow connection speeds.
> 
> Hmmm.  Of course this is "on hook."  I obtained these values using
> minicom.  I don't know how to talk to the modem when it is connected
> to somewhere, or more to the point how to query it.  I assumed the
> 9600 here was just the default for talking to the serial port, not
> the pass-through.

Because your modem is configured with &B1, the DTE baud rate won't
change when the modem connects.  If you manually dial with ATDT, the
CONNECT response should indicate the modem to modem connection speed
because you have extended response codes enabled (X4).  If it reports a
high connection speed, then you can eliminated a poor connection and
certain modem configuration problems as the cause of the problem.

If this was a serial modem the DTE speed wouldn't change if the
connection is initiated by the host.  If your internal modem attempts to
emulate the instantaneous I/O speed of a serial modem, then the fact
that it is locked at 9600 BAUD will severely limit the potential data
rate.

It's also possible that the modem is reporting 9600 BAUD just because
that is the speed that you are connecting to it with minicom.  You
should probably check the speed stored in NVRAM with the ATI5 command.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Modem won't connect at full speed

2003-08-19 Thread Don Lewis
On 19 Aug, Lars Eighner wrote:
> This has been answered by somebody on some forum, but I lost it.
> 
> I have an internal hardware modem.  I have configured ppp on demand.
> Unfortunately, my 56k modem connects at an effective rate of about
> 14.4k when I use ppp under FreeBSD.  I have got normal connect speeds
> with the same modem and the same ISP using Linux in the past.
> 
> Here is my ppp.log.  Notice the CCP rejection line, which I
> guess is pertinent:
> 
> Aug 19 20:02:59 pearl ppp[190]: tun0: Phase: bundle: Establish
> Aug 19 20:02:59 pearl ppp[190]: tun0: Phase: deflink: closed -> opening
> Aug 19 20:02:59 pearl ppp[190]: tun0: Phase: deflink: Connected!
> Aug 19 20:02:59 pearl ppp[190]: tun0: Phase: deflink: opening -> dial
> Aug 19 20:02:59 pearl ppp[190]: tun0: Chat: Phone: 4857440
> Aug 19 20:02:59 pearl ppp[190]: tun0: Chat: deflink: Dial attempt 1 of 1
> Aug 19 20:02:59 pearl ppp[190]: tun0: Chat: Send: ATZ^M
> Aug 19 20:02:59 pearl ppp[190]: tun0: Chat: Expect(5): OK
> Aug 19 20:02:59 pearl ppp[190]: tun0: Chat: Received: ATDT4857440^MATZ^M^M
> Aug 19 20:02:59 pearl ppp[190]: tun0: Chat: Received: OK
> Aug 19 20:02:59 pearl ppp[190]: tun0: Chat: Send: ATDT4857440^M
> Aug 19 20:03:01 pearl ppp[190]: tun0: Chat: Expect(40): CONNECT
> Aug 19 20:03:01 pearl ppp[190]: tun0: Chat: Received: ^M
> Aug 19 20:03:18 pearl ppp[190]: tun0: Chat: Received: ATDT4857440^M^M
> Aug 19 20:03:18 pearl ppp[190]: tun0: Chat: Received: CONNECT

If you manually dial using cu or tip, what connection speed does the
modem report?


> Here are my modem hardware setting (with explanations in square
> brackets):

> 
>BAUD=9600  PARITY=N  WORDLEN=8
>DIAL=TONEON HOOK   CID=0
> 
>&A3  &B1  &C1  &D2  &G0  &H1  &I0  &K1
> [&An   n=0  Disable /ARQ Result Codes
>n=1  Enable /ARQ Result Codes
>n=2  Enable /Modulation Codes
>   *n=3  Enable /Extra Result Codes
>  &Bn   n=0  Floating DTE Speed
>   *n=1  Fixed DTE Speed
>n=2  DTE Speed Fixed When ARQ

It is somewhat worrysome that your modem is reporting 9600 BAUD in the
fixed DTE speed setting.  I don't know about USR Internal modems, but at
least some implementations will pace the data flow rate to the reported
DTE speed to avoid overwhelming the host with quick bursts of
interrupts.  This might be the reason for your slow connection speeds.

It's been a while since I've used the proper incantation to reset the
speed on a USR modem, but I think with a serial modem, the procedure was
to connect to the modem, repeatedly type AT to get its attention, put it
in variable speed mode, reconnect at the desired speed, and set &B1 to
get it back to fixed rate.  Probably something similar will work with an
internal modem.  The modem will probably peek at the UART speed control
register to pick up the desired DTE rate and save it to its non-volatile
memory.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: USB versus SMP and Epson printers.

2003-08-12 Thread Don Lewis
On  6 Aug, Frank Mayhar wrote:
> On Monday I received my brand-new Epson C82, a replacement for a 900N with
> a dead print head.  I had already configured CUPS so I imagined that I would
> just hook it up with USB and everything would be happy.
> 
> Well, that's not how it turned out.
> 
> I tried two different machines, both with Tyan dual-CPU motherboards.  One
> is a Thunder 2500 (S1867) with dual PIII 866, my gateway/fax/server
> box and the one I preferred.  The other is my main desktop box, a Tiger
> MPX (2466N-4M) with dual Athlon MP 1900+.  Both displayed essentially
> the same problem, although the Tiger MPX seemed to come a little bit
> closer to working than the Thunder 2500.
> 
> Basically, although usbdevs would show the device, when I tried to do,
> say, an 'escputil -s -r /dev/ulpt0' (to show the ink levels), the process
> would seem to send something to the printer (I say "seem to" because I
> saw no evidence of it on the printer side), then sit in the USB code
> forever, timing out and looping.

Unless someone snuck it in while I wasn't looking, our ulpt
implementation doesn't support reading data from the printer, so it's
not possible to check the ink levels.  I've had to boot Linux in order
to do this.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: USB versus SMP and Epson printers.

2003-08-07 Thread Don Lewis
On  7 Aug, Frank Mayhar wrote:
> Don Lewis wrote:
>> Unless someone snuck it in while I wasn't looking, our ulpt
>> implementation doesn't support reading data from the printer, so it's
>> not possible to check the ink levels.  I've had to boot Linux in order
>> to do this.
> 
> Hmm.  Okay...  Unfortunately, the straight printing didn't work, either.  I
> tried the "check the ink levels" trick only after my test page never printed.
> I'm using CUPS, could this be a limitation of the ulpt driver?  Should I be
> using another device?

I use a laser printer for most of my printing, so I only use my Epson
Photo 890 when I need to print color.  I never bothered to set up print
spooling for it, and just point ghostscript at it.  One problem I ran
into is that anything I attempt to print after a power-on gets turned
into garbage that prints a few funky-looking characters at the top the
page and then ejects the page unless I first run "escputil -n -u -r
/dev/ulpt0", which seems to send a magic escape sequence to the printer
that puts it in the proper mode.  I haven't had a time to investigate
further.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: open() and ESTALE error

2003-06-30 Thread Don Lewis
On 29 Jun, I wrote:
> On 22 Jun, Andrey Alekseyev wrote:

>> Name cache can be purged by nfs_lookup(), if the latter finds that the
>> capability numbers doesn't match. In this case, nfs_lookup() will send a
>> new "lookup" RPC request to the server. Name cache can also be purged from
>> getnewvnode() and vclean(). Which code does that for the above scenario
>> it's quite obscure to me. Yes, my knowledge is limited :)
> 
> The vpid == newvp->v_id test in nfs_lookup() just detects if the vnode
> that the cache entry pointed to was recycled for another use while it
> was on the free list.  It doesn't detect whether the inode on the server
> was recycled.
> 
> When I was thinking about this problem, the solution I came up with was
> a lot like the
>   if (!VOP_GETATTR(newvp, &vattr, cnp->cn_cred, td)
> && vattr.va_ctime.tv_sec == VTONFS(newvp)->n_ctime)
> code fragment, but I would have done the ctime check on both the target
> and the parent directory and only ignored the cache entry if both ctimes
> had been updated.  Checking only the target should be more conservative,
> though it would be slower because there would be more cases where the
> client would have to do the RPC call.

I actually meant to say the mtime of the parent directory.

After doing some more testing, I believe the problem I'm seeing is
caused by the rename on the server not updating the seconds field of the
file ctime.  If the file was last changed at time N, if the client does
a lookup on the file and sees this ctime value, and the server renames
the file before the time on the server increments to the next second,
the ctime check nfs_lookup() won't detect that the cached lookup
information might be invalid.

The best way I could think of to fix this problem is to ignore the cache
entry and do the lookup RPC until we detect that the time on the server
has incremented to the next second, so that we know that the cached
lookup must be valid.  The problem is that I don't know how to get a
timestamp from the server.


>> I've also done a number of tcpdump's for different test patterns and I
>> believe, what happens with the cached vnode may depend on the results of
>> the "access" RPC request to the server.
> 
> That may be an important clue.  The access cache may be properly
> working, but the attribute cache timeout may be broken.

I'm pretty sure that the problem that you are having with open()
returning ESTALE is caused by the difference between the access cache
timeout and the attribute cache timeout.  It looks like your workaround
of retrying the open only works with NFSv3 because NFSv2() relies on
VOP_GETATTR(), and if the attribute cache timeout is too long the open()
will succeed and you'll only detect the failure when you actually do the
I/O.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: open() and ESTALE error

2003-06-29 Thread Don Lewis
On 22 Jun, Andrey Alekseyev wrote:
> Don,
> 
>> When a file is renamed on the server, its file handle remains valid.
> 
> Actually I was wrong in my assumption on how names are purged from the
> namecache. And I didn't mean an operation with a file opened on the client.
> And it actually happens that this sequence of commands will get you ENOENT
> (not ESTALE) on the client because of a new lookup in #4:
> 
> 1. server: echo "" > 1
> 2. client: cat 1
> 3. server: mv 1 2
> 4. client: cat 1  <--- ENOENT here

That's what it is supposed to do, but my testing would seem to indicate
that step 4 could return the file contents for an extended period of
time after the file was renamed on the server.

> Name cache can be purged by nfs_lookup(), if the latter finds that the
> capability numbers doesn't match. In this case, nfs_lookup() will send a
> new "lookup" RPC request to the server. Name cache can also be purged from
> getnewvnode() and vclean(). Which code does that for the above scenario
> it's quite obscure to me. Yes, my knowledge is limited :)

The vpid == newvp->v_id test in nfs_lookup() just detects if the vnode
that the cache entry pointed to was recycled for another use while it
was on the free list.  It doesn't detect whether the inode on the server
was recycled.

When I was thinking about this problem, the solution I came up with was
a lot like the
if (!VOP_GETATTR(newvp, &vattr, cnp->cn_cred, td)
&& vattr.va_ctime.tv_sec == VTONFS(newvp)->n_ctime)
code fragment, but I would have done the ctime check on both the target
and the parent directory and only ignored the cache entry if both ctimes
had been updated.  Checking only the target should be more conservative,
though it would be slower because there would be more cases where the
client would have to do the RPC call.

If the file on the server associated with the cached entry on the client
is renamed on the server, its file handle will remain valid, but its
ctime will be updated, so VOP_GETATTR() will succeed, but the ctime
check should be activated and the cache entry purged.

If the file on the server is unlinked or another file mv'ed on top of
it, its file handle should no longer be valid, so the VOP_GETATTR() call
should fail, which should cause the cache entry to be purged and a new
lookup RPC should be done.

What I find interesting is that in order for for open() to fail with the
ESTALE error, the cache entry must be used, which means that this
VOP_GETATTR() call must be succeeding, but for some reason another VOP
call after namei() returns is failing with ESTALE.

>> Here's the output of the script:
>> 
>> #!/bin/sh -v
>> rm -f file1 file2
>> ssh -n mousie rm -f file1 file2
>> echo foo > file1
>> echo bar > file2
>> ssh -n mousie cat file1
>> foo
>> ssh -n mousie cat file2
>> bar
>> tail -f file1 &
>> sleep 1
>> foo
>> cat file1
>> foo
>> cat file2
>> bar
>> ssh -n mousie 'mv file1 tmpfile; mv file2 file1; mv tmpfile file2'
>> cat file1
>> bar
>> cat file2
>> foo
>> echo baz >> file2
>> sleep 1
>> baz
>> kill $!
>> Terminated
>> ssh -n mousie cat file1
>> bar
>> ssh -n mousie cat file2
>> foo
>> baz
>> 
>> Notice that immediately after the files are swapped on the server, the
>> cat commands on the client are able to immediately detect that the files
>> have been interchanged and they open the correct files.  The tail
>> command shows that the original handle for file1 remains valid after the
>> rename operations and when more data is written to file2 after the
>> interchange, the data is appended to the file that was formerly file1.
> 
> By the way, what were the values of acregmin/acregmax/acdirmin/acdirmax and
> also the value of vfs.nfs.access_cache_timeout in your tests?

I'm using the the default values for
acregmin/acregmax/acdirmin/acdirmax.

% sysctl vfs.nfs.access_cache_timeout 
vfs.nfs.access_cache_timeout: 2

> I believe, the results of your test patterns heavily depend on the NFS
> attributes cache tunables (which happen to affect all cycles of NFS
> operation) and on the command execution timing as well. Moreover, I'm
> suspect that all this is badly linked with the type and sequence of
> operations on both the server and the client. Recall, I was about to fix
> just *one* common scenario :)

Some of my test cases waited for 120 seconds after the rename on the
server before attempting access from the client, which should be enough
time for the attribute cache to time out.

> With different values of acmin/acmax and access_cache_timeout, and manual
> operations I was able to achieve the result you consider as "proper" above
> and also, the "wrong" effect that you described below.
> 
>> And its output:
>> 
>> #!/bin/sh -v
>> rm -f file1 file2
>> ssh -n mousie rm -f file1 file2
>> echo foo > file1
>> echo bar > file2
>> ssh -n mousie cat file1
>> foo
>> ssh -n mousie cat file2
>> bar
>> sleep 1
>> cat file1
>> foo
>> cat file2
>> bar
>> ssh -n mousie 'mv file1 file2'
>> cat file2
>> foo
>> 

Re: open() and ESTALE error

2003-06-21 Thread Don Lewis
On 21 Jun, Andrey Alekseyev wrote:
> Don,
> 
>> old vnode and its associated file handle.  If the file on the server was
>> renamed and not deleted, the server won't return ESTALE for the handle
> 
> I'm all confused and messed up :)  Actually, a rename on the server is not
> the same as sillyrename on the client.  If you rename a file on the
> server for which there is a cached file handle on the client, next time
> the client will use its cached file handle, it'll get ESTALE from the server.
> I don't know how this happens, though. Until I dig more around all the
> rename paraphernalia, I won't know. If someone can clear this out, please
> do. It'll be much appreciated. At this time I can't link this with the
> inode generation number changes (as there is no new inode allocated when
> the file is renamed).

When a file is renamed on the server, its file handle remains valid.


I had some time to write some scripts to exercise this stuff and
discovered some interesting things.  The NFS server is a 4.8-stable box
named mousie, and the NFS client is running 5.1-current.  The tests were
run in my NFS-mounted home directory.

Here's the first script:

#!/bin/sh -v
rm -f file1 file2
ssh -n mousie rm -f file1 file2
echo foo > file1
echo bar > file2
ssh -n mousie cat file1
ssh -n mousie cat file2
tail -f file1 &
sleep 1
cat file1
cat file2
ssh -n mousie 'mv file1 tmpfile; mv file2 file1; mv tmpfile file2'
cat file1
cat file2
echo baz >> file2
sleep 1
kill $!
ssh -n mousie cat file1
ssh -n mousie cat file2

Here's the output of the script:

#!/bin/sh -v
rm -f file1 file2
ssh -n mousie rm -f file1 file2
echo foo > file1
echo bar > file2
ssh -n mousie cat file1
foo
ssh -n mousie cat file2
bar
tail -f file1 &
sleep 1
foo
cat file1
foo
cat file2
bar
ssh -n mousie 'mv file1 tmpfile; mv file2 file1; mv tmpfile file2'
cat file1
bar
cat file2
foo
echo baz >> file2
sleep 1
baz
kill $!
Terminated
ssh -n mousie cat file1
bar
ssh -n mousie cat file2
foo
baz

Notice that immediately after the files are swapped on the server, the
cat commands on the client are able to immediately detect that the files
have been interchanged and they open the correct files.  The tail
command shows that the original handle for file1 remains valid after the
rename operations and when more data is written to file2 after the
interchange, the data is appended to the file that was formerly file1.

My second script is an attempt to reproduce the open() -> ESTALE error.

#!/bin/sh -v
rm -f file1 file2
ssh -n mousie rm -f file1 file2
echo foo > file1
echo bar > file2
ssh -n mousie cat file1
ssh -n mousie cat file2
sleep 1
cat file1
cat file2
ssh -n mousie 'mv file1 file2'
cat file2
cat file1

And its output:

#!/bin/sh -v
rm -f file1 file2
ssh -n mousie rm -f file1 file2
echo foo > file1
echo bar > file2
ssh -n mousie cat file1
foo
ssh -n mousie cat file2
bar
sleep 1
cat file1
foo
cat file2
bar
ssh -n mousie 'mv file1 file2'
cat file2
foo
cat file1
cat: file1: No such file or directory

Even though file2 was unlinked and replaced by file1 on the server, the
client immediately notices the change and is able to open the proper
file.


Since my scripts weren't provoking the reported problem, I wondered if
this was a 4.x vs. 5.x problem, or if the problem didn't occur in the
current working directory, or if the problem only occurred if a
directory was specified in the file path.  I modified my scripts to work
with a subdirectory and got rather different results:

#!/bin/sh -v
rm -f dir/file1 dir/file2
ssh -n mousie rm -f dir/file1 dir/file2
echo foo > dir/file1
echo bar > dir/file2
ssh -n mousie cat dir/file1
foo
ssh -n mousie cat dir/file2
bar
tail -f dir/file1 &
sleep 1
foo
cat dir/file1
foo
cat dir/file2
bar
ssh -n mousie 'mv dir/file1 dir/tmpfile; mv dir/file2 dir/file1; mv dir/tmpfile 
dir/file2'
sleep 120
cat dir/file1
bar
cat dir/file2
bar
echo baz >> dir/file2
sleep 1
kill $!
Terminated
ssh -n mousie cat dir/file1
bar
baz
ssh -n mousie cat dir/file2
foo

Even after waiting long enough for the cached attributes to time out,
the one of cat commands on the client opened the incorrect file and when
the shell executed the echo command to append to one of the files, the
wrong file was opened and appended to.  Conclusion, the client is
confused and retrying open() on an ESTALE error is insufficient to fix
the problem.

By specifying a directory in the path, I'm was also able to reproduce
the ESTALE error one time, but now I always get:

#!/bin/sh -v
rm -f dir/file1 dir/file2
ssh -n mousie rm -f dir/file1 dir/file2
echo foo > dir/file1
echo bar > dir/file2
ssh -n mousie cat dir/file1
foo
ssh -n mousie cat dir/file2
bar
sleep 1
cat dir/file1
foo
cat dir/file2
bar
ssh -n mousie 'mv dir/file1 dir/file2'
sleep 120
cat dir/file2
foo
cat dir/file1
foo

unless I decrease the sleep time:

#!/bin/sh -v
rm -f dir/file1 dir/file2
ssh -n mousie rm -f dir/file1 dir/file2
echo foo > dir/file1
echo bar > dir/file2
ssh -n mousie cat dir/file1
foo
ssh -n mousie cat dir/file

Re: open() and ESTALE error

2003-06-20 Thread Don Lewis
On 20 Jun, Andrey Alekseyev wrote:
>> Eh, but the generation number for file1 should have been changed! This will
> 
> I'm sorry, the generation number is not changed in your scenario. Thus,
> I believe if the sequence of actions on the server is 
> 
> mv file1 tmpfile
> mv file2 file1
> mv tmpfile file1
> 
> like you described, it's safe to continue to use a cached file handle
> for file1 on the server since it still references the original file.
> And file2 just disappears from the server.

Well just its contents ... but this still violates POLA.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: open() and ESTALE error

2003-06-20 Thread Don Lewis
On 20 Jun, Andrey Alekseyev wrote:
> Don,
> 
>> One case where there is a difference between timing out old file handles
>> and just invalidating them on ESTALE:
> 
> Frankly, I just didn't find any mechanism in the STABLE kernel that
> does "timing out" for file handles. Do you mean, it would be nice to have
> it or are you trying to point it out to me? ;-P

If there isn't such a mechanism, there should be.

>> client%  cmd1 > file1; cmd2 > file2
>> server% mv file1 tmpfile; mv file2 file1; mv tmpfile file1
>> 
>> wait an hour
>> 
>> client% cat /dev/null > file1
>> 
>> If file handles are cached indefinitely, and the client didn't recycle
>> the vnode for file1, which file on the server got truncated?  Since
>> neither file was deleted on the server, you can't rely on ESTALE to
>> detect this situation.
> 
> Eh, but the generation number for file1 should have been changed! This will
> result in a definite ESTALE error for file1 from the server. That is, I
> believe that if you attempt to open("file1", O_CREAT) after an hour, you'll
> get ESTALE from the server (on which nfs_request() will invalidate "file1"
> namecache entry and vnode+nfsnode+old-file-handle) and the second vn_open()
> will re-lookup file1 and get a valid new file handle.

If the client still has a cached copy of the file handle for file1,
won't it just use that and truncate file2 on the server?  The handle
never doesn't stale because the file was never deleted on the server.

> Actually, this is what indeed happens if the second open() comes from the
> userland application :)  I'm just trying to eliminate the need of modifying
> a generic application.
> 
> For my example with moves, the next "cat" will always(!) succeed.
> 
>> Question: does the timeout of the directory attributes cause open() do
>> do an NFS lookup on the file, or does open() just find the vnode in the
>> cache and use its cached handle?
> 
> Well, for open() without O_CREAT the sequence is this:
> open() -> vn_open() -> namei() -> lookup() -> VOP_LOOKUP() -> nfs_lookup()
>   |
>   VOP_ACCESS() -> nfs_access() [ -> nfs3_access_otw() ]
>   |
>   VOP_OPEN() -> nfs_open()
> 
> Lookup is always done first (obviously). It may return cached name which
> contains a pointer to a cached vnode/nfsnode. Cached vnode/nfsnode is used
> further in VOP_ACCESS() and VOP_OPEN(). Either function may or may not
> update file attributes cached inside nfsnode. Neither VOP_ACCESS() or
> VOP_OPEN() ever updates the *file handle*. File handle comes from
> VOP_LOOKUP().  And VOP_LOOKUP() only places it there if vnode/nfsnode isn't
> cached.  Which I believe happens only if there is no cached filename in
> the namecache. I really tried to do my best to describe everything in:
> http://www.blackflag.ru/patches/nfs_attr.txt
> Please take a look.

If the client is mostly idle, then the cached filename is unlikely to be
flushed, so even after a long period of time, namei() will return the
old vnode and its associated file handle.  If the file on the server was
renamed and not deleted, the server won't return ESTALE for the handle
and open() will return a descriptor for the original file on the server
that has since been renamed, not for the new file on the server that
lives at the path name passed to open() on the client.

Another example:

client% cmd1 > file1
client% cmd2 > file2
client% more file1
^Z
suspended

server% mv file1 tmpfile; mv file2 file1; mv tmpfile file2

wait 24 hours

client% cat /dev/null > file1
client% fg

The last cat comand should truncate file1 on the server, which is the
output of cmd2.  When the more command resumes, it should still be able
to able to see the output of cmd1.  The old file1 vnode and file handle
should remain valid, but the lookup to open file1 for the last cat
command needs to know that the cache entry has timed out and that the
handle associated with the cached vnode for file1 hasn't been validated
in a while.  Lookup() needs to bypass the cache in the case and pass the
lookup request to the server.  If the file handle returned is the same
as before, the cache entry should be freshened, if the file handle is
different then a new vnode needs to be allocated and associated with the
name cache entry and the new handle.  The old vnode and its handle need
to be retained until either an rpc using this handle returns ESTALE, or
the the file is closed and the vnode is recycled.


> Whether ESTALE came from VOP_ACCESS() or VOP_OPEN() depends on several
> factors. Namely, the value of nfsaccess_cache_timeout sysctl, acmin/acmax
> and the age of the file in question.
> 
> Generally speaking, if nfsaccess_cache_timeout is less than acmin,
> VOP_ACCESS() that comes right before VOP_OPEN() in vn_open() will try to do
> an "access" RPC request and it'll fail if the file handle is stale. If
> nfsaccess_cache_timeout is greater than acmin, than it's possible that
> VOP_ACCESS() will answer "yes" basing on the cached at

Re: open() and ESTALE error

2003-06-20 Thread Don Lewis
On 20 Jun, Andrey Alekseyev wrote:

> In the normal situation, namecache entry+vnode+nfsnode+file handle may
> stay cached for a really long time (until re-used? deleted or renamed
> on the *client*). Expiring file handles (a new mechanism?) means much the
> same to me as simply obtaining a new name cache entry+other data
> on ESTALE :) I may be wrong, though.

One case where there is a difference between timing out old file handles
and just invalidating them on ESTALE:

client% cmd1 > file1; cmd2 > file2
server% mv file1 tmpfile; mv file2 file1; mv tmpfile file1

wait an hour

client% cat /dev/null > file1

If file handles are cached indefinitely, and the client didn't recycle
the vnode for file1, which file on the server got truncated?  Since
neither file was deleted on the server, you can't rely on ESTALE to
detect this situation.

Question: does the timeout of the directory attributes cause open() do
do an NFS lookup on the file, or does open() just find the vnode in the
cache and use its cached handle?
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: open() and ESTALE error

2003-06-19 Thread Don Lewis
On 19 Jun, Andrey Alekseyev wrote:
> Hello,
> 
> I've been trying lately to develop a solution for the problem with
> open() that manifests itself in ESTALE error in the following situation:
> 
> 1. NFS server: echo "" > file01
> 2. NFS client: cat file01
> 3. NFS server: echo "" > file02 && mv file02 file01
> 4. NFS client: cat file01 (either old file01 contents or ESTALE)
> 
> My study shows that actually the problem appears to be in VOP_ACCESS()
> which is called from vn_open(). If nfs_access() decides to "go to the wire"
> in #4, it then uses a cached file handle which is indeed stale. Thus,
> open() eventually fails with ESTALE too (ESTALE comes from underlying
> nfs_request()).
> 
> I understand all the fundamental NFS-related integrity problems, but not
> this one :) That is, I see no reason for open() to fail to open a file for
> reading or writing if the system knows the problem is it's own. Why not
> just do another lookup and try obtain a valid file handle?
> 
> I was playing with different parts of the kernel while "fixing" this for
> myself. However, I believe, the simpliest patch would be for
> vfs_syscalls.c:open() (I've also made a working patch against vn_open(),
> though).
> 
> Could anyone please be so kind to comment this issue?
> 
> TIA
> 
> --- kern/vfs_syscalls.c.orig  Thu Jun 19 13:22:50 2003
> +++ kern/vfs_syscalls.c   Thu Jun 19 13:29:11 2003
> @@ -1008,6 +1008,7 @@
>   int type, indx, error;
>   struct flock lf;
>   struct nameidata nd;
> + int stale = 0;
>  
>   oflags = SCARG(uap, flags);
>   if ((oflags & O_ACCMODE) == O_ACCMODE)
> @@ -1025,8 +1026,15 @@
>* the descriptor while we are blocked in vn_open()
>*/
>   fhold(fp);
> +again:
>   error = vn_open(&nd, flags, cmode);
>   if (error) {
> + /*
> +  * if the underlying filesystem returns ESTALE
> +  * we must have used a cached file handle.
> +  */
> + if (error == ESTALE && stale++ == 0)
> + goto again;
>   /*
>* release our own reference
>*/

I can't get very enthusiastic about changing the file system independent
code to fix a deficiency in the NFS implementation.

If the name of the file are you attempting to open is relative to your
current working directory, and your current working directory is nuked
on the server, vn_open will return ESTALE, and your patch above will
loop forever.

NFS really doesn't work very well if modifications are make by both a
client and the server, or by multiple clients.  Solaris attempts to
compensate with a mount option:
   noac  Suppress data and attribute  caching.  The  data
 caching  that is suppressed is the write-behind.
 The local page cache is  still  maintained,  but
 data  copied  into  it is immediately written to
 the server.


If the rename on the server was done within the attribute validity time
on the client, vn_open() will succeed even without your patch, but you
may encounter the ESTALE error when you actually try to read or write
the file.

Unless you have some sort of locking protocol or other way of
synchronizing this sequence of operations on the client and server, the
server could do the rename while the client has the file open, after
which some I/O operation on the client will encounter ESTALE.

If the problem is that open() is failing a long time after the server
did the rename, then the best solution may be for the client to time out
file handles more aggressively.  If the vnode on the client is closed,
the file handle could be timed out after acregmin/acregmax or
acdirmin/acdirmax, or a new handle timeout parameter.  This may decrease
performance, but nothing is free ...
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: floppy.. Was: Drawing graphics on terminal

2003-06-16 Thread Don Lewis
On 16 Jun, Juli Mallett wrote:

> Not to turn this into too much of a bikeshed, but here's an idea I
> jotted down a while ago:
> 
> %%%
> There has been a lot of talk about deprecation of floppies in upcoming
> releases, and I've been thinking a lot about whether or not we need to
> do this, and I've been thinking especially about when it makes sense
> to have the installer at all, and have come up with three cases, and
> how a floppy would fit in to them.  This is intended to help come up
> with ways of having single-purpose floppies that are easier to keep
> small enough to fit on, well, floppies.

I've thought for a long time that this is the best way to go.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: gcc problem/openoffice failure

2003-05-27 Thread Don Lewis
On 27 May, M. Warner Losh wrote:
> In message: <[EMAIL PROTECTED]>
> "Bruce R. Montague" <[EMAIL PROTECTED]> writes:
> : 
> :  Julian Elischer wrote:
> : 
> :  > ... I have not been able to compile the openoffice port ...
> : 
> :  > ... Has anyone else seen this?
> : 
> :  
> :  I tried to build openoffice on a "clean" -current system,
> : built from a recent cvsup, and it failed to compile... This
> : was perhaps a week and a half ago, kept meaning to get back
> : and look at it, but time seems to have got the best of me.
> 
> I wouldn't attempt something this complex without portupgrade...

That reminds me ... the openoffice port ignores non-zero exit status in
too many places.  More than once I've had the install phase (and maybe
even the build phase fail which then proceeded to the install phase
which then failed), but the the exit status was ignored, make exited
with a zero status, and portupgrade thought that the installation
succeeded and then nuked the working backup copy of openoffice and did a
make clean. A build from scratch takes more than 24 hours on my -stable
box ...
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: gcc problem/openoffice failure

2003-05-27 Thread Don Lewis
On 27 May, Julian Elischer wrote:
> 
> For the last month (more actually)(and after completely rebuilding my
> system and all the ports on it) I have not been able to compile 
> the openoffice port due to gcc failures. 
> (I have posted the message earlier several times)
> Has anyone been able to compile the openoffice port recently?
> 
> somewhere in it's private compilation of mozilla (why does it do that? I
> already have mozilla running?) gcc (doing c++) has a heart attach and
> keels over dead.

I think the reason to make the build take longer.  Lots of fun on 400
MHz PII.

> This has been reproducible for me for at least a month and probably
> more. 
> 
> Has anyone else seen this?
> Is the openoffice port working for everyone else?
> (on FreeBSD 4.8++) (4.8-RELEASE had the same problem for me)

I was able to build it on April 22.  I don't know when the previous
cvsup and buildworld was.  My kernel.old dates to April 25th, so I don't
have any history before that time.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Mount option "nomtime"?

2002-10-08 Thread Don Lewis

On  8 Oct, Oliver Fromme wrote:

> I should have been more specific in my examples.  Sorry.
> 
> Think about INN with using cycbuffers (CNFS) when storing
> news articles (which is pretty standard on fullfeed news
> servers).  Those cycbuffers are a bunch of large files.
> Their size never changes, but a lot of data is written to
> them all the time.  The same goes for the overview data
> when using the so-called buffindexed storage.  INN itself
> does not need the mtime information of the buffer files.
> 
> Another example would be "oops", which is a very fast,
> lightweight web proxy.  It uses cyclic buffer files to
> store the cached data, similar to INN's CNFS.
> 
> I think in the above cases, a "nomtime" option would indeed
> save some unnecessary overhead.

Probably not much, especially if you are using soft updates.  The
in-kernel copy of the inode will get updated on every write, but the
on-disk copy will only get written when the soft updates timer for it
goes off, which I think would be once every 10 seconds and is tunable. I
don't think you'll see much reduction in load compared to all the other
I/O that's going on.

Noatime won't help much in your examples either.  It only buys you a lot
if the data is spread over a large number of files.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: USB support for new HP printers?

2002-08-26 Thread Don Lewis

On 26 Aug, Mikko Työläjärvi wrote:
> On Mon, 26 Aug 2002, Terry Lambert wrote:
> 
>> John Nielsen wrote:

>> > "FreeBSD, NetBSD, and OpenBSD are not yet supported in USB mode, due to
>> > missing functionality in the kernel "ulpt" driver (bidirectional I/O,
>> > device ID retrieval, switching to 7/1/3, and HP channel-change-request)."
>>
>>
>> I'm pretty sure that bidirectional I/O is supported, or there
>> would be no network devices.
> 
> [...]
> 
> Though the USB stack handles bi-directional communication, ulpt does not:
> 
> Static struct cdevsw ulpt_cdevsw = {
>/* open */ulptopen,
>/* close */   ulptclose,
>/* read */noread,  < !
> ...

I'd like to have bi-directional communication support in the ulpt driver
for my Epson ink jet printer so that I could use the utilities to check
the ink levels and align the nozzles.  This feature is on the ToDo list
on the FreeBSD USB project page.  I looked at the existing driver code a
while back, but that's as far as I got :-(


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Memory corruption in CURRENT

2002-08-22 Thread Don Lewis

On 22 Aug, Terry Lambert wrote:

> Alternatively, rather than those options, try losing 512M of
> the RAM... I note they are all 1G boxes.

When I first put this system together several months ago, I only
installed the first 512M of RAM and the problem was much worse.  I only
had about a 50% chance of getting a successful buildworld.  The problem
seemed to go away shortly thereafter, and though I wasn't sure whether
the problem was caused by hardware or software, I attributed the
improvement to an upgrade to a newer version of -current.  Since then
I've replaced the motherboard (the old one didn't support ECC), the disk
and controller (I needed more space, and the new disk consumes a lot
less power than the two that it replaced, plus it doesn't sound like a
dental drill), and the power supply (because I was concerned that the
original might be marginal).  I also added an extra intake fan on the
front of the case.

At the moment I'm running a set of buildworlds with an August 6th
kernel, just to verify the problem that I'm seeing isn't something new.
When I'm done with that, I'll reduce the RAM from 1G to 512M and try
again.  I'll also try the DISABLE_PSE and DISABLE_PG_G options.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Memory corruption in CURRENT

2002-08-22 Thread Don Lewis

On 22 Aug, Soeren Schmidt wrote:

> However, this kind of problem in most cases spells bad HW to me,
> ie subspec RAM, poor powersupply, badly cooled CPU, overclocking etc etc...

My motherboard chipset supports ECC RAM and I have ECC RAM installed.  I
upgraded to an expensive Antec power supply that has better specs than
most of the other supplies I looked at.  The system is plugged into a
surge supressor.  I don't currently have an UPS.  I added an extra case
fan and drive cooler fans, and the two failures happened in the evening
after the room cooled off.  For some reason xmbmon doesn't seem to be
working at the moment, but when I looked at the temperatures previously
they seemed to be acceptable.  I don't believe in overclocking.



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Memory corruption in CURRENT

2002-08-22 Thread Don Lewis

On 22 Aug, Mark Santcroos wrote:
> On Thu, Aug 22, 2002 at 09:43:45AM +0200, Martin Blapp wrote:

>> Thats memory corruption. I'm also not able anymore
>> to make 10 buildworlds (without -j, that triggers
>> panics in pmap code).
>> 
>> Bye the way, I'm experiencing this since about 4-5 months.
>> 
>> All hackers, please help to track this down.
> 
> Is it P4 specific or not?

Nope.  I'm seeing it on an AMD Athlon XP 1900+.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: offtopic: low level format of IDE drive.

2002-07-09 Thread Don Lewis

On  8 Jul, Peter Wemm wrote:
> Julian Elischer wrote:
>> this is not a 'reformat'
>> 
>> what I want to do is an old-fashionned refomat/verify where the controller
>> writes new track headers etc.
> 
> The thing is, just about all IDE drives more than a few GB or so do 'track
> writing' and have no fixed sectoring or sector positioning.  ie: each time
> you write a single sector to a track, it does a read-modify-write of *THE
> ENTIRE TRACK*.  This is why we have to have write caching turned on for IDE
> drives to get decent performance.  Without it, it essentially rewrites the
> entire track over and over and over again because it cannot fill its write
> buffer in order to write a contiguous block to completely replace what was
> there before.  ie: each track is one giant physical sector with multiple logical
> sectors inside it.
> 
> The really annoying thing is that most newer scsi drives do this too.

How readily available is the information about which drives do this?  As
someone who only buys the occasional drive, I'd rather not have to buy
one and do the evaluation myself using the method mentioned later in
this thread.


> Get a UPS if you value the data. :-]

That doesn't help if the cat knocks a book off the shelf onto the power
switch, or if you trip over the cord between the UPS and the computer,
or if the magic smoke escapes from the computer power supply.




To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: 'make release' tries to build a port?

2002-07-02 Thread Don Lewis

On  2 Jul, Brian Reichert wrote:
> I'm mucking with 'make release' under 4.5-RELEASE, and keep running
> into a stumbling block:
> 
> When the documentation toolset is being built in the chrooted
> environment, at one point docbook-dsssl-doc is built, among other
> things, via ports.
> 
> Regrettably, this is hosted on Sourceforge, who've interposed a
> 'pick-a-mirror' webpage when you attempt to do a regular http
> download.
> 
> So, the port build fails, and 'make release' fails, and rerunning
> 'make release' scrubs all of your work.
> 
> I seem to have two options:
> 
> - convince Sourceforge to unbreak themselves, or
> - install these documentation tools on my system, and make them
>   somehow available in the chrooted environment.

Try manually downloading a copy of the distfile into your non-chrooted
/usr/ports/distfiles directory.  "make release" copies all your
distfiles into the chrooted environment.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: getting a coredump before boot device found

2002-07-01 Thread Don Lewis

On 30 Jun, Julian Elischer wrote:
> 
> 
> On Sun, 30 Jun 2002, Charles Sprickman wrote:
> 
>> On Sun, 30 Jun 2002, Julian Elischer wrote:

>> > do you have 2 machines you can link together?
>> 
>> Oh yes.  But I know nothing about remote gdb.
> 
> it's not that difficult..
> 
> have the source and compile directory on the 2nd machine so gdb can access
> it. compile with debug symbols (config -g). Have the following .gdbinit in
> the compile/MYKENEL directory:

What about gdb versions?  Can you use the -stable version of gdb to
debug a machine running current or is it necessary to crossbuild the
-current version of gdb so that it runs on -stable (assuming i386 on
both ends)?


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: fdcheckstd() test bug in execve() (was: Re: Suggested fixes for uidinfo "would sleep" messages)

2002-06-20 Thread Don Lewis

On 20 Jun, Mike Makonnen wrote:
> On Thu, 20 Jun 2002 00:04:41 -0700 (PDT)
> Don Lewis <[EMAIL PROTECTED]> wrote:
> 
>> 
>> Your patch also looks like it should fix the bug.  I prefer my patch,
>> though, because I think the resultant code is structured better and
>> should be easier to understand.  For instance, the reason for the
>> assignment to oldcred in the "if (error != 0)" block in your patch is
>> not immediately obvious.  
> 
> You can remove it, it was part of something else I was working on.

When I looked at it last night, it appeared to be necessary, since if
the fdcheckstd() test fails oldcred will be left pointing to the
credential held by the process.  On further review, I see that this
assignment isn't necessary after all because of the code that calls
crfree() just looks at the state of newcred.

> I haven't taken a look at your patch. I was working on something else
> and already had a patch for it, before I saw yours. I sent it as a 
> reference because there was something in the thread about 
> leaking p_args.

I suprised that things didn't even get more mucked up because the
process was never unlocked.

> I really don't care which patch makes it into the tree. If it solves
> the problem, it solves the problem. There's not much more to it.

Alfred committed yours earlier today.




To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



RE: daemon()

2000-11-08 Thread Don Lewis

On Nov 8,  5:06pm, Max Khon wrote:
} Subject: RE: daemon()
} hi, there!
} 
} On Wed, 8 Nov 2000, Koster, K.J. wrote:
} 
} > >   No one with any brains uses bash 1 for anything 
} > > anymore.
} 
} > Then why is it there? To help up the port count? If it's not good, it should
} > be nuked, IMHO.
} 
} people still use it because it is smaller
} obrien has already tried to remove it once (in Mar 1999)
} 
} as for me -- I do not try to hunt bugs in bash1 and do not blame it.
} my question was about unclosed pipe

It appears to be a descriptor that your shell failed to close before
execing your test program.  Unless you do something out of the ordinary
like run
program 27>somefile
the shell should only leave three descriptors (0, 1, and 2 for stdin, stdout,
and stderr) open when it execs another program.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: daemon()

2000-11-07 Thread Don Lewis

On Nov 7, 11:41pm, [EMAIL PROTECTED] wrote:
} Subject: Re: daemon()
} 
} 
} On Tue, 7 Nov 2000, Max Khon wrote:
} 
} > what is FD 4?
} 
} I can't reproduce this? Does it always happen?

It might be something that the shell forgets to close, so it will be
dependent on which shell you use.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: SIOCSPGRP documentation?

2000-09-01 Thread Don Lewis

On Sep 1,  4:55pm, John DeBoskey wrote:
} Subject: SIOCSPGRP documentation?
} Hi,
} 
}Like the subject says, I'm looking for documentation
} on the SIOCSPGRP ioctl call:
} 
} rc = ioctl(port,SIOCSPGRP,&pid);
} 
}I've gone through the source tree and while I can find
} references where it's used, and where the functionality
} is defined, there appears to be no doc (or man page).

Take a look at
man 4 tty
and the controlling terminal information in
man 4 termios


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Redirect stdout/stderr to syslog [OFF-TOPIC]

2000-09-01 Thread Don Lewis

On Sep 1,  3:24pm, Peter Pentchev wrote:
} Subject: Re: Redirect stdout/stderr to syslog [OFF-TOPIC]
} On Fri, Sep 01, 2000 at 02:13:19PM +0200, Alexander Maret wrote:
} > > -Ursprungliche Nachricht-
} > > Von: Peter Pentchev [mailto:[EMAIL PROTECTED]]
} > > Gesendet: Freitag, 1. September 2000 14:00

} > > pipe your stdout/stderr to logger(1), and you're all set.  
} > > You may even
} > > specify a facility/level to log with.
} > > 
} > 
} > Thanks for your quick answer but I would prefer to
} > do it entirely in C without calling external progs. 
} > I could think of a solution forking another child process 
} > which does the syslog logging and redirecting stdout/stderr
} > of the execvped program via IPC to this child.
} > 
} > But is there any easier solution?
} 
} No, I don't think you can do anything cheaper than a fork and
} a pipe(2). popen(), as suggested in another message, is pretty
} much the same.  I don't think stdio has a hook to capture all
} the data a process is sending to a stream, and pass it to some
} routine - that would be perfect, but unfortunately, I am not
} aware of such a thing.  I might be wrong though.

It's not very widely implemented, so any code using it won't be
portable, but take a look at the man page for fuopen(3).


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: How many files can I put in one diretory?

2000-06-22 Thread Don Lewis

On Jun 22,  5:11pm, "Daniel O'Connor" wrote:
} Subject: Re: How many files can I put in one diretory?
} 
} On 22-Jun-00 Luigi Rizzo wrote:
} >  that sounds insane! Because a name is a name, why dont they call
} >  those files xx/yy/zz/tt.html and the like, to get down to a more
} >  reasonable # of files per directory.
} >  
} >  Or use a single file and a cgi which extracts things from the right place.
} >  In such a context, i assume that the best place to do the name lookup
} >  is in the app, not in the kernel.
} 
} Yeah.. This is why databases where invented :)
} 
} FYI 4 in a directory really makes directory listings slow.. 2 million would
} suck :)

Only if directory lookups use a sequential search.  Not all filesystem
implementations sequentially scan directory entries.  Some use btrees or
other ways of quickly finding the desired directory entry.  Even so,
you probably still would want to avoid doing an "ls" or an "echo *" ;-)

I'd recommend looking at how squid stores its disk cache.  It has a
very similar performance problem to solve.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: file creation times ?

2000-05-25 Thread Don Lewis

On May 24,  6:58pm, Arun Sharma wrote:
} Subject: Re: file creation times ?
} On Thu, May 25, 2000 at 11:03:38AM +1000, Peter Jeremy wrote:
} > To put it another way, why _should_ FreeBSD store a file creation time?
} 
} 0. I'm tired of seeing people putting "Created: mm/dd/yy" in their documents.

When saving a document to "file", many editors will do the equivalent of
save document to "file.new"
ln "file" "file.bak"
mv "file.new" "file"
in order to minimize the possibility of losing the document if the editor
or the system crashes at just the wrong time.  The result of this would
be to set the file creation time to the time it was last saved.  This
won't be very helpful if you are relying on the file creation time to
tell you when the *document* was first created.

NFS doesn't support this file timestamp, so you lose if the file is stored
on another server.

The tar archive format doesn't support this timestamp, so a document that
is archived using tar and later restored will lose its notion of when it
was created.

What should the semantics of the creation time be across a backup and
restore?  Should the original creation time be restored, or should the
creation time be the time when the restored copy of the file is written?
What about just copying a file?  If I make an exact copy of a document,
should the two copies have the same or differing creation times?


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: please hellllllllllllp me!

2000-05-23 Thread Don Lewis

On May 22,  3:33pm, Alfred Perlstein wrote:
} Subject: Re: please hep me!
} * David Scheidt <[EMAIL PROTECTED]> [000522 14:30] wrote:

} > dscheidt@shell-2 ~ 536$ ls -al | grep .snapshot
} > dscheidt@shell-2 ~ 537$ ls -al .snapshot
} > total 60
} > drwxrwxrwx   2 root  wheel   4096 May 22 15:01 .
} > drwxr-xr-x  15 dscheidt  dialin  8192 May 22 15:51 ..
} > drwxr-xr-x  15 dscheidt  dialin  8192 May 22 14:58 hourly.0
} > drwxr-xr-x  15 dscheidt  dialin  8192 May 22 13:52 hourly.1
} > drwxr-xr-x  15 dscheidt  dialin  8192 May 22 13:00 hourly.2
} > drwxr-xr-x  15 dscheidt  dialin  8192 May 22 10:52 hourly.3
} > drwxr-xr-x  15 dscheidt  dialin  8192 May 19 16:34 nightly.0
} > drwxr-xr-x  15 dscheidt  dialin  8192 May 19 16:34 nightly.1
} > dscheidt@shell-2 ~ 538$ 
} > 
} > doesn't count then?  This is a directory NFS-mounted from a NetApp.  The
} > .snapshot directory is a lifesaver, and support cost cutter.
} 
} If the netapp doesn't honor readdir requests properly then it's
} breaking unix semantics.
} 
} Netapp is broken, there's no reason to intentionally hide this
} directory from readdir.

It would be really annoying to have to exclude all of these every
time you wanted to roll a tarball of a directory tree.  Also, a lot
of the time you probably won't want find or other recursive things
to wander into these directories.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: NFS server problems on 3.4-S, any interest?

2000-05-23 Thread Don Lewis

On May 22,  1:32pm, Matthew Dillon wrote:
} Subject: Re: NFS server problems on 3.4-S, any interest?
} :>From the workstation:
} :Name  Mtu   Network   Ipkts  IerrsOpktsOerrs Coll  Drop
} :fxp0  150032102492 0  31653667   0   30900   0

30900 collisions is a pretty good clue that fxp0 is not in full-duplex
mode.  In full-duplex mode both NICs are allowed to transmit at the
same time and the collision sensing circuitry is supposed to be turned
off.

I would expect to se Oerrs in this case, though.  This card should
be seing most of the collisions after 1 slot time, which it should
sense as late collions, and I *think* it should count these as Oerrs.

} :>From the fileserver:
} :Name  Mtu   Network   Ipkts  IerrsOpktsOerrs Coll   Drop
} :xl0   1500  3250417328967   32900227   00   0
} :
} : I did find it a little unusual that I was getting collisions on a
} :crossover cable, but when I looked at the mail archives related to that
} :problem I read that the intel cards are very aggressive packet pushers,
} :and that this isn't all that unusual. The ratio of good packets to
} :collisions seemed healthy enough to not warrant too much concern. 
} 
} 28967 input errors on xl0?  Problem!

These are probably the frames where fxp0 sensed a late collsion and
aborted packet transmission, resulting in a CRC error.

} But the real problem is that you are attempting to do 10BaseT 
} full-duplex.  Full-duplex operation with 10BaseT is problematic
} at best.  Full duplex has good interoperability at 100BaseTX speeds,
} but not at 10BaseT speeds.

10BaseT full-duplex should work ok as long as you configure everything
manually.  The only way it could work auto-magically would be if both
cards used Nway, which you'll only see on 10/100 or 100BaseTX cards
and if you've got two of those they'll negotiate 100 Mbit speeds :-)

} Crossover cables work fine, usually, but I personally *never* use them.
} I always throw a switch in between the machines and let it negotiate
} the duplex mode with each machine independantly,

twice as many chances to get things wrong, too.

} plus it gives me nice
} shiny LEDs that tell me what the switch thinks the port is doing as
} a sanity check.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Keeping using locally modified source

2000-03-03 Thread Don Lewis

On Mar 3, 11:47am, Assar Westerlund wrote:
} Subject: Re: Keeping using locally modified source

} There's even a hack in FreeBSD cvs and cvsup to allow you to keep a
} `local' branch that's not clobbered by cvsup, namely the environment
} variable CVS_LOCAL_BRANCH_NUM.

I thought about using this, but it doesn't appear to be easy to track
changes to an official branch.  I was looking for something that would
be as easy tracking changes made by infrequent imports on the vendor
branch.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: why FFS is THAT slower than EXT2 ?

1999-10-27 Thread Don Lewis

On Oct 27,  2:51pm, Julian Elischer wrote:
} Subject: Re: why FFS is THAT slower than EXT2 ?
} 
} 
} On Wed, 27 Oct 1999, Mike Smith wrote:
} 
} > > in order to save space I gzip'ped output of my tests. 
} > > ungzipping ports tarball on FreeBSD took 28 min
} > > on Linux --- about 2.5 times faster.
} > 
} > This is something we already know, and it's not the sort of test that 
} > you should ever headline as "why is FFS so much slower"?
} 
} Kirk has said that it would be possible for the FFS to modify its
} behaviour if it notices this usage pattern.

The basic problem is that the directory layout policy that FFS uses
is very non-optimal in this case.  This was discussed extensively
on freebsd-hackers last year, search the list archive for
Reading/writing /usr/ports VERY slow

Carl Mascott noticed nearly a 3x speedup when he untarred /usr/ports
on FFS filesystem that was generated with only one cylinder group.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Developer assessment (was Re: A bike shed ...)

1999-10-06 Thread Don Lewis

On Oct 4, 12:52am, Darryl Okahata wrote:
} Subject: Re: Developer assessment (was Re: A bike shed ...)

} 1. Instant escalation.  Example: supplicant A asks question in FreeBSD
}group.  Some FreeBSD contributor says, "RTFM", and does not give any
}useful information whatsoever like which "FM" or even a vague area.
}Supplicant A asks for more information, said FreeBSD contributor
}insults supplicant A for being clueless newbie crud and flamefest
}results.
} 
}Lesson: if you can't say anything nice, don't say it at all.  Look at
}it this way: you won't have wasted your time, your blood pressure
}will be lower, and you won't look stupid for having stooped to
}insults, which also doesn't reflect well upon the FreeBSD
}contributors.

This is reasonable advice, though what sometimes happens is that A is
then frustrated by the deafening silence and either spams multiple
lists or becomes irate because he thinks he is being actively shunned.
This can also be a problem if the question isn't answered in the FM and
the few FreeBSD experts who are competent to answer the question are
too busy to answer right away or are otherwise distracted.

I think it would lower the frustration level all around if we had some
volunteer question answerers to take some of the load off the developers.
It doesn't sound like fun to me, but neither does doing documentation,
but somehow FreeBSD has managed to find volunteers who only work on
the documentation.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: mbuf shortage situations

1999-09-07 Thread Don Lewis
On Sep 5,  9:18pm, Matthew Dillon wrote:
} Subject: Re: mbuf shortage situations
} : The only reason that I see for which we would actually panic() in
} :this situation (as opposed to suffer the packet loss) is if we get to the
} :point where we're losing packets because some script kid starts up
} :something that will eat up sockbuf space and continuously fork, then we
} :would lose all remote access to the machine in question (since all packets
} :would be dropped) and we wouldn't really mind a panic() for obvious
} :practical reasons.

Well, I really would mind the panic().

} : In any case, I, personally, would prefer to suffer packet loss as
} :opposed to a panic (especially now that Brian is in the process of writing
} :diffs that will allow us to limit socket buffer space per UID through
} :login.conf!)
} : Having MGET store that null (e.g. fail as opposed to panic) on a
} :M_WAIT seems fairly easy to fix, and would probably require some patching
} :that would ensure that the packet loss is handeled relatively 'cleanly'
} :(probably some debugging), but I wouldn't mind doing this. However, I'd
} :like to know if there are objections to doing this or, in fact, if there
} :are any suggestions on how to handle mbuf shortage situations (aside from
} :just limiting -- although limiting is in itself a good solution and I'm
} :glad that Brian F. is working on that).

At least historically most of the panics have been caused by the code
not properly checking the result of the MGET and dereferencing a NULL
pointer.  Any of those that are still in the code need to be fixed.

My impression is that for reasonably recent versions of FreeBSD
this attack doesn't panic the machine but just wedges the network
system due to mbuf exhaustion.  The problem is that if you get to
this point you're basically hosed.  It's OK to toss packets that
you receive from the net as long as you haven't sent an ack for them,
toss outgoing UDP packets, and block writes to stream sockets, but
you can't toss acked TCP packets that you've received, or the data
queued to a stream socket by write().  This particular attack does
the latter, so the only possible fix is to prevent all the mbufs
from being consumed by it in the first place.

} The issue is basically having someone find the time to figure out
} how to gracefully unwind various pieces of network code when an
} mbuf cannot be allocated.  Once that is done, the panic can be
} turned into a (rate-limited) printf.

That won't help.  All that does is keep a root spinning on a failed
syscall instead of blocking on MGET when he's trying to log in to
kill the errant process.


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: mbuf shortage situations

1999-09-07 Thread Don Lewis

On Sep 5,  9:18pm, Matthew Dillon wrote:
} Subject: Re: mbuf shortage situations
} : The only reason that I see for which we would actually panic() in
} :this situation (as opposed to suffer the packet loss) is if we get to the
} :point where we're losing packets because some script kid starts up
} :something that will eat up sockbuf space and continuously fork, then we
} :would lose all remote access to the machine in question (since all packets
} :would be dropped) and we wouldn't really mind a panic() for obvious
} :practical reasons.

Well, I really would mind the panic().

} : In any case, I, personally, would prefer to suffer packet loss as
} :opposed to a panic (especially now that Brian is in the process of writing
} :diffs that will allow us to limit socket buffer space per UID through
} :login.conf!)
} : Having MGET store that null (e.g. fail as opposed to panic) on a
} :M_WAIT seems fairly easy to fix, and would probably require some patching
} :that would ensure that the packet loss is handeled relatively 'cleanly'
} :(probably some debugging), but I wouldn't mind doing this. However, I'd
} :like to know if there are objections to doing this or, in fact, if there
} :are any suggestions on how to handle mbuf shortage situations (aside from
} :just limiting -- although limiting is in itself a good solution and I'm
} :glad that Brian F. is working on that).

At least historically most of the panics have been caused by the code
not properly checking the result of the MGET and dereferencing a NULL
pointer.  Any of those that are still in the code need to be fixed.

My impression is that for reasonably recent versions of FreeBSD
this attack doesn't panic the machine but just wedges the network
system due to mbuf exhaustion.  The problem is that if you get to
this point you're basically hosed.  It's OK to toss packets that
you receive from the net as long as you haven't sent an ack for them,
toss outgoing UDP packets, and block writes to stream sockets, but
you can't toss acked TCP packets that you've received, or the data
queued to a stream socket by write().  This particular attack does
the latter, so the only possible fix is to prevent all the mbufs
from being consumed by it in the first place.

} The issue is basically having someone find the time to figure out
} how to gracefully unwind various pieces of network code when an
} mbuf cannot be allocated.  Once that is done, the panic can be
} turned into a (rate-limited) printf.

That won't help.  All that does is keep a root spinning on a failed
syscall instead of blocking on MGET when he's trying to log in to
kill the errant process.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: mmap mapped segment length

1999-08-21 Thread Don Lewis
On Aug 21,  2:10am, Wes Peters wrote:
} Subject: mmap mapped segment length
} I discovered to my dismay today that the length field in the mmap call is
} a size_t, not an off_t.  I was attempting to process a large (~50 MByte) file
} and found I was only processing the first 4 MBytes of it.

50 MB should comfortably fit in a size_t.

} Is this intentional, or just an artifact of the implementation?  Is there any
} reason NOT to change this to an off_t?

The type of size_t is supposed to be large enough to express the length of
any object that will fit in the virtual address space of a process.  Since
a size_t is 32 bits on an i386 and pointers are also 32 bits, there wouldn't
be any advantage to changing mmap() to use a 64 bit wide length parameter,
since you wouldn't be able to access all of such a large object.


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: mmap mapped segment length

1999-08-21 Thread Don Lewis

On Aug 21,  2:10am, Wes Peters wrote:
} Subject: mmap mapped segment length
} I discovered to my dismay today that the length field in the mmap call is
} a size_t, not an off_t.  I was attempting to process a large (~50 MByte) file
} and found I was only processing the first 4 MBytes of it.

50 MB should comfortably fit in a size_t.

} Is this intentional, or just an artifact of the implementation?  Is there any
} reason NOT to change this to an off_t?

The type of size_t is supposed to be large enough to express the length of
any object that will fit in the virtual address space of a process.  Since
a size_t is 32 bits on an i386 and pointers are also 32 bits, there wouldn't
be any advantage to changing mmap() to use a 64 bit wide length parameter,
since you wouldn't be able to access all of such a large object.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: BSD XFS Port & BSD VFS Rewrite

1999-08-17 Thread Don Lewis
On Aug 16,  9:18pm, Terry Lambert wrote:
} Subject: Re: BSD XFS Port & BSD VFS Rewrite

} > I don't see how the namei recursion method prevents catching // as a
} > namespace escape.
} 
} 
} //apple-resource-fork/intermediate_dir/some_other_dir/file_with_fork
} 
} You can't inherit the fact that you are looking at the resource fork
} in the terminal component, ONLY.

I don't think this is a good example.  How would you access the resource
fork of a file relative to the current directory?  IMHO, the necessary
goop needs to go at the end of the path name.


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: BSD XFS Port & BSD VFS Rewrite

1999-08-17 Thread Don Lewis

On Aug 16,  9:18pm, Terry Lambert wrote:
} Subject: Re: BSD XFS Port & BSD VFS Rewrite

} > I don't see how the namei recursion method prevents catching // as a
} > namespace escape.
} 
} 
} //apple-resource-fork/intermediate_dir/some_other_dir/file_with_fork
} 
} You can't inherit the fact that you are looking at the resource fork
} in the terminal component, ONLY.

I don't think this is a good example.  How would you access the resource
fork of a file relative to the current directory?  IMHO, the necessary
goop needs to go at the end of the path name.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: gethostbyaddr() and threads.

1999-08-10 Thread Don Lewis
On Aug 9,  9:21pm, Dan Moschuk wrote:
} Subject: Re: gethostbyaddr() and threads.
} 
} | Well, I guess we might as well change the API, since everyone else does. 
Unless
} | someone comes up with a bettter idea, of course :)
} | 
} | -Joe
} 
} The API should not change.  There is already enough descrepency between UNIXs
} to warrant programs like autoconf, we should not introduce another.
} We should introduce a gethostbyaddr_r function, which shouldn't be all that
} though to implement.
} 
} >From the code that I looked at today, the problems lie inside of glibc.  It 
} declares globally a few static variables that are used by the gethost* 
} functions.  Obviously in a threaded environment, this is bad.
} 
} A nice fix would be to get rid of those variables entirely.  A quicker fix 
} would be just to enclose those global variables in mutexes.  Personally, I 
} like the nicer fix better, which will (unfortunately) involve rewriting most 
} of the frontends to the res_* functions.
} 
} If no one has any objections, I'd like to start on this tomorrow.

You might want to grab the latest BIND release from ftp.isc.org.  One
of the comments in the CHANGES file from a while ago is:

 384.   [feature]   there is now a nearly-thread-safe resolver API, with
the old non-thread-safe API being a set of stubs on
top of this.  it is possible to program without _res.
note: the documentation has not been updated.  also
note: IRS is a thread-ready API, get*by*() is not.
(see ../contrib/manyhosts for an example application.)

There's no sense re-inventing any more wheels than necessary.


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: gethostbyaddr() and threads.

1999-08-10 Thread Don Lewis

On Aug 9,  9:21pm, Dan Moschuk wrote:
} Subject: Re: gethostbyaddr() and threads.
} 
} | Well, I guess we might as well change the API, since everyone else does. Unless
} | someone comes up with a bettter idea, of course :)
} | 
} | -Joe
} 
} The API should not change.  There is already enough descrepency between UNIXs
} to warrant programs like autoconf, we should not introduce another.
} We should introduce a gethostbyaddr_r function, which shouldn't be all that
} though to implement.
} 
} >From the code that I looked at today, the problems lie inside of glibc.  It 
} declares globally a few static variables that are used by the gethost* 
} functions.  Obviously in a threaded environment, this is bad.
} 
} A nice fix would be to get rid of those variables entirely.  A quicker fix 
} would be just to enclose those global variables in mutexes.  Personally, I 
} like the nicer fix better, which will (unfortunately) involve rewriting most 
} of the frontends to the res_* functions.
} 
} If no one has any objections, I'd like to start on this tomorrow.

You might want to grab the latest BIND release from ftp.isc.org.  One
of the comments in the CHANGES file from a while ago is:

 384.   [feature]   there is now a nearly-thread-safe resolver API, with
the old non-thread-safe API being a set of stubs on
top of this.  it is possible to program without _res.
note: the documentation has not been updated.  also
note: IRS is a thread-ready API, get*by*() is not.
(see ../contrib/manyhosts for an example application.)

There's no sense re-inventing any more wheels than necessary.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: quad_t and portability

1999-08-06 Thread Don Lewis
On Aug 6,  3:29pm, Sheldon Hearn wrote:
} Subject: quad_t and portability
} 
} Hi folks,
} 
} I want to patch wc(1) so that it uses quad_t instead of u_long. This is
} necessary if wc(1) is to produce sensible results for files containing
} more than 4GB of data.

Why not off_t, which should be portable and scale properly with the
maximum system file size.  Then the only problem is figuring a portable
means of printing the result ...


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



Re: quad_t and portability

1999-08-06 Thread Don Lewis

On Aug 6,  3:29pm, Sheldon Hearn wrote:
} Subject: quad_t and portability
} 
} Hi folks,
} 
} I want to patch wc(1) so that it uses quad_t instead of u_long. This is
} necessary if wc(1) is to produce sensible results for files containing
} more than 4GB of data.

Why not off_t, which should be portable and scale properly with the
maximum system file size.  Then the only problem is figuring a portable
means of printing the result ...


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: Pictures from USENIX

1999-07-07 Thread Don Lewis
On Jul 4,  5:35pm, "Jonathan M. Bresler" wrote:
} Subject: Re: Pictures from USENIX

}   beards are great...women love them, getting fluffed is much
} better than getting scratchedkids love them.  brush the beard
} whenever you brush your hair.  dont hae to deal with a buzzing razor,
} very unkind to newly awoken folk.  dont ahve to wield a blade across
} you neck in a fogged monring stupor.
} 
} jmb--i aint shaved in 18 years.

I've got you beat by 4.5 years ;-)


To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-hackers" in the body of the message



  1   2   >