Re: kernel panic trying to utilize a da(4)/umass(4) device with ohci(4)

2003-11-20 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, "Brian F. Feldman"
 writes:
>Jeez, it's been broken a year and it's almost 5.2-RELEASE now.  Does anyone 
>have ANY leads on these problems?  I know precisely nothing about how my USB 
>hardware is supposed to work, but this OHCI+EHCI stuff definitely doesn't, 
>and it's really not uncommon at all.  Is it unbroken in NetBSD currently?

I had some success with this patch:

Index: usb_mem.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/dev/usb/usb_mem.c,v
retrieving revision 1.5
diff -u -r1.5 usb_mem.c
--- usb_mem.c   4 Oct 2003 22:13:21 -   1.5
+++ usb_mem.c   27 Oct 2003 15:39:03 -
@@ -142,7 +142,8 @@
 	s = splusb();
 	/* First check the free list. */
 	for (p = LIST_FIRST(&usb_blk_freelist); p; p = LIST_NEXT(p, next)) {
-		if (p->tag == tag && p->size >= size && p->align >= align) {
+		if (p->tag == tag && p->size >= size && p->size < size * 2 &&
+		    p->align >= align) {
 			LIST_REMOVE(p, next);
 			usb_blk_nfree--;
 			splx(s);

It seems that since the conversion to busdma, the USB code can end
up attempting to use contigmalloc() to allocate multi-page regions
from an interrupt thread(!). The above doesn't fix that; it just
prevents successful large (e.g. 64k) contiguous allocations from
being wasted when a much smaller amount of space is needed.
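
A minimal userland sketch of the free-list test in the patch above. The
struct and function names here are hypothetical stand-ins, not the real
usb_mem.c types; only the size comparison mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached DMA block; not the usb_mem.c type. */
struct blk {
	size_t size;	/* bytes in the block */
	size_t align;	/* alignment the block was allocated with */
	int    in_use;	/* nonzero once handed out */
};

/*
 * Return the index of the first free block that satisfies the request
 * without being twice as large as needed (or larger), or -1 if none fits.
 */
static int
pick_block(struct blk *list, int n, size_t size, size_t align)
{
	int i;

	assert(list != NULL && size > 0);
	for (i = 0; i < n; i++) {
		if (!list[i].in_use && list[i].size >= size &&
		    list[i].size < size * 2 && list[i].align >= align) {
			list[i].in_use = 1;
			return (i);
		}
	}
	return (-1);
}
```

With a 64k block at the head of the list, a 64-byte request skips it and
takes a smaller block, so the large contiguous region stays available for
a request that really needs it.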

With the above, I was able to use ohci+ehci fairly reliably on a
Soekris box with a large USB2 disk attached via a cardbus USB2
adaptor. I also have a few other local patches that may help too -
some of them are below:

Ian

Index: ohci.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/dev/usb/ohci.c,v
retrieving revision 1.132
diff -u -r1.132 ohci.c
--- ohci.c  24 Aug 2003 17:55:54 -  1.132
+++ ohci.c  21 Sep 2003 15:28:27 -
@@ -1405,12 +1405,13 @@
if (std->flags & OHCI_ADD_LEN)
xfer->actlen += len;
if (std->flags & OHCI_CALL_DONE) {
+   ohci_free_std(sc, std); /* XXX */
xfer->status = USBD_NORMAL_COMPLETION;
s = splusb();
usb_transfer_complete(xfer);
splx(s);
-   }
-   ohci_free_std(sc, std);
+   } else
+   ohci_free_std(sc, std);
} else {
/*
 * Endpoint is halted.  First unlink all the TDs
@@ -2246,6 +2247,7 @@
usb_uncallout(xfer->timeout_handle, ohci_timeout, xfer);
usb_transfer_complete(xfer);
splx(s);
+   return;
}
 
if (xfer->device->bus->intr_context || !curproc)
Index: usbdi.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/dev/usb/usbdi.c,v
retrieving revision 1.82
diff -u -r1.82 usbdi.c
--- usbdi.c 24 Aug 2003 17:55:55 -  1.82
+++ usbdi.c 21 Sep 2003 15:28:29 -
@@ -751,6 +751,7 @@
pipe, xfer, pipe->methods));
/* Make the HC abort it (and invoke the callback). */
pipe->methods->abort(xfer);
+   KASSERT(SIMPLEQ_FIRST(&pipe->queue) != xfer, ("usbd_ar_pipe"));
/* XXX only for non-0 usbd_clear_endpoint_stall(pipe); */
}
pipe->aborting = 0;
@@ -763,8 +764,9 @@
 {
usbd_pipe_handle pipe = xfer->pipe;
usb_dma_t *dmap = &xfer->dmabuf;
+   usbd_status status;
int repeat = pipe->repeat;
-   int polling;
+   int polling, xfer_flags;
 
SPLUSBCHECK;
 
@@ -835,30 +837,33 @@
xfer->status = USBD_SHORT_XFER;
}
 
-   if (xfer->callback)
-   xfer->callback(xfer, xfer->priv, xfer->status);
-
-#ifdef DIAGNOSTIC
-   if (pipe->methods->done != NULL)
+   /* Copy any xfer fields in case the xfer goes away in the callback. */
+   status = xfer->status;
+   xfer_flags = xfer->flags;
+   /*
+* For repeat operations, call the callback first, as the xfer
+* will not go away and the "done" method may modify it. Otherwise
+* reverse the order in case the callback wants to free or reuse
+* the xfer.
+*/
+   if (repeat) {
+   if (xfer->callback)
+   xfer->callback(xfer, xfer->priv, status);
pipe->methods->done(xfer);
-   else
-   printf("usb_transfer_complete: pipe->methods->done == NULL\n");
-#else
-   pipe->methods->done(xfer);
-#endif
-
-   if ((xfer->flags & USBD_SYNCHRONOUS) && !polling)
-   wakeup(xfer);
+   } else {
+

Re: cvs commit: src/sbin/umount umount.c

2003-11-18 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Rudolf Cejka writes:
>>   If the unmount by file system ID fails, don't warn before retrying
>>   a non-fsid unmount if the file system ID is all zeros. This is a
...
>Hello and thanks for fixing this! I had a plan to report this, but you
>were faster :o) I'm interested in this area - please, can you tell, what
>do you plan to do in your more complete fix?
>
>When I looked at this issue, I thought about some things:
>
>* Why is f_fsid zeroed for non-root users at all? Is there any reason?

As I understand it, the main reason for hiding file system IDs from
non-root users is because file system IDs are used as part of NFS
file handles on an NFS server, so hiding them makes it harder to
guess a valid file handle. If you know the file system ID and an
inode number, then you would only need to guess the 32-bit inode
generation number.

OpenBSD started zeroing out file system IDs for non-root users a
long time ago, and while FreeBSD mostly followed suit, I think it
was only with Kirk's 64-bit statfs changes a few days ago that we
have started doing this consistently (we had missed getfsstat()
before).

I was planning to return a filesystem ID of {st_dev, 0} to non-root
users, where st_dev is the device number that is already returned
by the stat(2) system call. This requires a few changes, because
currently st_dev comes from va_fsid in struct vattr, which is not
directly accessible at the time a file system is mounted. Since
many userland applications depend on st_dev being persistent and
unique, I think it makes more sense to have it as part of struct
mount instead of struct vattr.

Some additional changes are required to guarantee the uniqueness
of st_dev's and file system IDs (including {st_dev, 0} ones), and
then unmount(2) needs to accept these user-visible IDs. In fact,
maybe {st_dev, 0} could be returned to root too, but that might
possibly break some NFS-related utilities.
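
As a rough illustration of the {st_dev, 0} idea, here is a userland
sketch that builds such an ID from stat(2). The struct and function
names are hypothetical; the real change would live in the kernel, and
on systems where dev_t is wider than 32 bits the cast below truncates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical user-visible ID: two 32-bit words, like fsid_t. */
struct user_fsid {
	int32_t val[2];
};

/*
 * Fill in {st_dev, 0} for the file system containing path.
 * Returns 0 on success, -1 if stat(2) fails.
 */
static int
user_fsid_from_path(const char *path, struct user_fsid *out)
{
	struct stat sb;

	assert(path != NULL && out != NULL);
	if (stat(path, &sb) != 0)
		return (-1);
	out->val[0] = (int32_t)sb.st_dev;	/* may truncate a wide dev_t */
	out->val[1] = 0;
	return (0);
}
```

Because the ID is derived only from st_dev, it is useful only if st_dev
is unique and persistent per mount, which is the guarantee the
additional kernel changes mentioned above would have to provide.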

>* There are small typos in umount.c:

Thanks - fixed locally, but there's no urgency to commit before
5.2.

>* Do you understand, why there is line in umount.c:376
>  getmntentry(NULL, NULL, &sfs->f_fsid, REMOVE)
>  ? I'm not sure, but if it is needed for some reason,
>  I think that there should be used different getmntentry() according
>  to the used unmount() method, like in this patch:

I think umount(8) first gets a list of all mounted file systems and
then uses that list to resolve a mountpoint or device node into a
struct statfs. When unmounting all file systems, it needs to
ignore any file systems that it has already unmounted, or it might
attempt to unmount the same file system twice. If the unmount call
fails, it should still do the REMOVE operation so that it will at
least attempt an unmount on each file system.

You're right that this will not work correctly with zeroed file
system IDs (it worked before Kirk's commit last week, but wasn't
supposed to). In practice can it ever make things worse than the
uniqueness problems caused by non-root users having no file
system ID? I can't think of any examples offhand.

>* /usr/src/sbin/mount/mount.c: If user uses mount -v, it prints false
>  zeroed fsids - isn't it better to print just non-zero fsids, so that
>  nobody is "mystified"? I have created two patches - I do not know
>  which do you consider as a better:

Yes, I guess now that getfsstat(2) also zeros the IDs for non-root,
there isn't much point in printing them.

Ian
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Re: INPCB panic....

2003-11-10 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Sam Leffler writes:
>On Monday 10 November 2003 11:37 am, Larry Rosenman wrote:
>> I removed my wi0 card (with DHCLIENT running), and got the following panic
>> on a -CURRENT from yesterday:
>
>Thanks.  Working on it...

FYI, I've been using the following patch locally which seems to
trigger the printf sometimes when wi0 is ejected. Without the patch,
it used to dereference a stale struct ifnet and crash. I have an
approx 1 week old kernel, so this particular problem may have been
fixed already.
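
The check that the patch below adds can be sketched outside the kernel
as follows. All names here (x_route, x_cache, X_RTF_UP) are
hypothetical stand-ins; the reference-count decrement merely stands in
for RTFREE().

```c
#include <assert.h>
#include <stddef.h>

#define X_RTF_UP 0x1	/* hypothetical flag standing in for RTF_UP */

struct x_route {	/* hypothetical, far smaller than struct rtentry */
	int flags;
	int refcnt;
};

struct x_cache {
	struct x_route *rt;	/* cached route, or NULL */
};

/* Drop the cached route if it is no longer up; return 1 if dropped. */
static int
cache_validate(struct x_cache *c)
{
	assert(c != NULL);
	if (c->rt != NULL && (c->rt->flags & X_RTF_UP) == 0) {
		c->rt->refcnt--;	/* stands in for RTFREE() */
		c->rt = NULL;
		return (1);
	}
	return (0);
}
```

The point is only that the cache is revalidated before use, so a route
that was taken down while cached is released rather than followed.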

Ian

Index: in_pcb.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/netinet/in_pcb.c,v
retrieving revision 1.125
diff -u -r1.125 in_pcb.c
--- in_pcb.c    1 Nov 2003 07:30:07 -   1.125
+++ in_pcb.c    3 Nov 2003 00:52:41 -
@@ -564,10 +564,12 @@
 * destination, in case of sharing the cache with IPv6.
 */
ro = &inp->inp_route;
-   if (ro->ro_rt &&
-   (ro->ro_dst.sa_family != AF_INET ||
-satosin(&ro->ro_dst)->sin_addr.s_addr != faddr.s_addr ||
-inp->inp_socket->so_options & SO_DONTROUTE)) {
+   if (ro->ro_rt && ((ro->ro_rt->rt_flags & RTF_UP) == 0 ||
+   ro->ro_dst.sa_family != AF_INET ||
+   satosin(&ro->ro_dst)->sin_addr.s_addr != faddr.s_addr ||
+   inp->inp_socket->so_options & SO_DONTROUTE)) {
+   if ((ro->ro_rt->rt_flags & RTF_UP) == 0)
+   printf("clearing non-RTF_UP route\n");
RTFREE(ro->ro_rt);
ro->ro_rt = (struct rtentry *)0;
}

Re: Fixing -pthreads (Re: ports and -current)

2003-09-24 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Daniel
 Eischen writes:
>On Wed, 24 Sep 2003, Scott Long wrote:
>> PTHREAD_LIBS is a great tool for the /usr/ports mechanism, but doesn't
>> mean anything outside of that.
>
>That just meant it makes it easier to maintain ports so that
>they are PTHREAD_LIBS compliant (they would break when linked).
>I know it has no bearing on 3rd party stuff.

Just to throw one further approach out on the table, below is a
patch that makes gcc read from a file to determine what library to
associate with the -pthread flag. It's a hack of course, and probably
neither correct nor optimal. If you want to make -pthread mean libkse,
create an /etc/pthread.libs that looks like:

-lc_r:  -lkse
-lc_r_p:-lkse_p

I haven't been following the discussion in any detail - this is
just another possibility that is worth mentioning as it could retain
compatibility for users that want -pthread to mean use the default
thread library.

Ian

Index: gcc.c
===
RCS file: /dump/FreeBSD-CVS/src/contrib/gcc/gcc.c,v
retrieving revision 1.36
diff -u -r1.36 gcc.c
--- gcc.c   11 Jul 2003 04:45:39 -  1.36
+++ gcc.c   24 Sep 2003 15:37:14 -
@@ -331,6 +331,7 @@
 
 static const char *if_exists_spec_function PARAMS ((int, const char **));
 static const char *if_exists_else_spec_function PARAMS ((int, const char **));
+static const char *thread_lib_override_spec_function PARAMS ((int, const char **));
 
 /* The Specs Language
 
@@ -1440,6 +1441,7 @@
 {
   { "if-exists",   if_exists_spec_function },
   { "if-exists-else",  if_exists_else_spec_function },
+  { "thread-lib-override", thread_lib_override_spec_function },
   { 0, 0 }
 };
 
@@ -7335,4 +7337,46 @@
 return argv[0];
 
   return argv[1];
+}
+
+/* thread-lib-override built-in spec function.
+
+   Override a thread library according to /etc/pthread.libs  */
+
+static const char *
+thread_lib_override_spec_function (argc, argv)
+     int argc;
+     const char **argv;
+{
+  static char buf[256];
+  FILE *fp;
+  int n;
+
+  /* Must have exactly one argument.  */
+  if (argc != 1)
+    return NULL;
+
+  fp = fopen ("/etc/pthread.libs", "r");
+  if (fp == NULL)
+    return argv[0];
+
+  while (fgets (buf, sizeof(buf), fp) != NULL)
+    {
+      n = strlen (buf);
+      while (n > 0 && buf[n - 1] == '\n')
+        buf[--n] = '\0';
+      if (buf[0] == '#' || buf[0] == '\0')
+        continue;
+      n = strlen (argv[0]);
+      if (strncmp (buf, argv[0], n) != 0 || n >= sizeof (buf) || buf[n] != ':')
+        continue;
+      n++;
+      while (buf[n] != '\0' && isspace ((unsigned char)buf[n]))
+        n++;
+      fclose (fp);
+
+      return &buf[n];
+    }
+  fclose (fp);
+  return argv[0];
 }
Index: config/freebsd-spec.h
===
RCS file: /dump/FreeBSD-CVS/src/contrib/gcc/config/freebsd-spec.h,v
retrieving revision 1.14
diff -u -r1.14 freebsd-spec.h
--- config/freebsd-spec.h   21 Sep 2003 07:59:16 -  1.14
+++ config/freebsd-spec.h   24 Sep 2003 15:38:11 -
@@ -160,8 +160,8 @@
 #if __FreeBSD_version >= 500016
 #define FBSD_LIB_SPEC "\
   %{!shared:   \
-%{!pg: %{pthread:-lc_r} -lc}   \
-%{pg:  %{pthread:-lc_r_p} -lc_p}   \
+%{!pg: %{pthread:%:thread-lib-override(-lc_r)} -lc}\
+%{pg:  %{pthread:%:thread-lib-override(-lc_r_p)} -lc_p}\
   }"
 #else
 #define FBSD_LIB_SPEC "\

Re: PLIP transmit timeouts -- any solutions?

2003-08-14 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Christopher Nehren writes:
>
>I currently have a PLIP link to an old laptop running Linux (I tried to
>install FreeBSD, but it freezes at the USB detection -- yes, I tried

Try the following patch. I can't remember if all the changes in
this are necessary, but I think I found it fixed problems when
interoperating with a Linux-like PLIP implementation.

If I remember correctly, the PLIP implementation I saw used the
data bits that came in the very first read that had the correct
handshake signal, whereas FreeBSD readers do one extra read after
the handshake to ensure that the signal is stable (i.e. that
implementation used an "unsafe" read and a "safe" write, whereas
FreeBSD's uses a "safe" read and an "unsafe" write). This patch
causes both read and write to be "safe". The removal of the use of
ctxmitl[] seems to be unnecessary.
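
To make the "safe" write concrete, the sketch below computes the four
data-register values the patched clpoutbyte() would write for one byte.
The function name and the STROBE macro are hypothetical labels; the
waits for the peer's handshake between the nibbles are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STROBE 0x10	/* handshake bit, the 0x10 used in the patch */

/*
 * Compute the four data-register writes for one byte. Each nibble is
 * placed on the data lines first, and the handshake bit changes only
 * in the write that follows. Peer-acknowledge waits are omitted.
 */
static void
plip_safe_writes(uint8_t byte, uint8_t out[4])
{
	uint8_t lo = byte & 0x0f;
	uint8_t hi = (byte & 0xf0) >> 4;

	assert(out != NULL);
	out[0] = lo;		/* low nibble, strobe still clear */
	out[1] = lo | STROBE;	/* raise strobe: low nibble is valid */
	out[2] = hi | STROBE;	/* high nibble, strobe still set */
	out[3] = hi;		/* drop strobe: high nibble is valid */
}
```

For the byte 0xa5 this yields 0x05, 0x15, 0x1a, 0x0a: a reader that
samples the data bits on the very first read showing the new handshake
level always sees the nibble that was written one step earlier.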

Ian

Index: if_plip.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/dev/ppbus/if_plip.c,v
retrieving revision 1.28
diff -u -r1.28 if_plip.c
--- if_plip.c   4 Mar 2003 23:19:54 -   1.28
+++ if_plip.c   12 Mar 2003 07:09:43 -
@@ -409,12 +409,14 @@
 static __inline int
 clpoutbyte (u_char byte, int spin, device_t ppbus)
 {
-	ppb_wdtr(ppbus, ctxmitl[byte]);
+	ppb_wdtr(ppbus, byte & 0xf);
+	ppb_wdtr(ppbus, (byte & 0xf) | 0x10);
 	while (ppb_rstr(ppbus) & CLPIP_SHAKE)
 		if (--spin == 0) {
 			return 1;
 		}
-	ppb_wdtr(ppbus, ctxmith[byte]);
+	ppb_wdtr(ppbus, ((byte & 0xf0) >> 4) | 0x10);
+	ppb_wdtr(ppbus, ((byte & 0xf0) >> 4));
 	while (!(ppb_rstr(ppbus) & CLPIP_SHAKE))
 		if (--spin == 0) {
 			return 1;
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Re: Problem with umount and fsid?

2003-07-18 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Nate Lawson writes:
>I get an error when umounting a FAT filesystem on a USB flash drive.  It
>appears the device is properly unmounted.  Is this a case that needs to be
>fixed in our fsid code?  It happens every time I unmount this device.

>laptop# umount /thumb
>umount: unmount of /thumb failed: No such file or directory
>umount: retrying using path instead of file system ID
>laptop# mount | grep da0
>laptop#

Thanks for the report - in theory this should only occur if you
have a kernel from before July 1st but a newer userland. Assuming
that's not the case, I must have overlooked something. Could you
update to the latest sbin/mount, and then post the output of:

mount -v | grep /thumb
truss umount /thumb

Thanks,

Ian

Re: Heads up: checking in change to ata-card.c

2003-06-26 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, "M. Warner Losh" writes:
>Here's a better patch, basesd on wpaul's input.  Bill, can you try it
>an see if it works for you?  If so, i would be better to commit this
>one.  If not, I'll work with you to fix it. 

FYI, I have a no-name ("PCMCIA"/"CD-ROM") drive that also requires
failure of the second IO range to be made non-fatal. How about just
deleting the `else' clause as in the patch below? It seems that
this can only affect CD-ROM drives that were otherwise not working,
so it should be fairly safe.

Ian

Index: ata-card.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/dev/ata/ata-card.c,v
retrieving revision 1.14
diff -u -r1.14 ata-card.c
--- ata-card.c  17 Jun 2003 12:33:53 -  1.14
+++ ata-card.c  26 Jun 2003 23:00:01 -
@@ -131,10 +131,6 @@
 start + ATA_ALTOFFSET, ATA_ALTIOSIZE);
}
 }
-else {
-   bus_release_resource(dev, SYS_RES_IOPORT, rid, io);
-   return ENXIO;
-}
 
 /* allocate the altport range */
 rid = ATA_ALTADDR_RID;


Re: msgbuf cksum mismatch (read 933e3, calc 93fbe)

2003-06-05 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Paolo Pisati writes:
>
>What does it mean?
>
>It's the first row in my today's kernel.

You can safely ignore it. Some BIOSes don't clear the RAM during a
reboot, so when booting up, FreeBSD attempts to pick up the kernel
message buffer from before the reboot (this can be very handy if
the reboot was caused by a panic). The above message indicates that
the message buffer from the last boot initially appeared to be
intact, but its checksum didn't match the contents, so it was
cleared.
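
The keep-or-clear decision can be sketched like this. The layout, the
magic value and the byte-sum checksum are all assumptions made for the
example; they are not taken from the kernel's struct msgbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define X_MAGIC 0x4d534742u	/* hypothetical marker value */

struct x_msgbuf {		/* hypothetical layout */
	unsigned int magic;
	unsigned int cksum;	/* sum of all bytes in data[] */
	char data[64];
};

static unsigned int
x_cksum(const struct x_msgbuf *mb)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < sizeof(mb->data); i++)
		sum += (unsigned char)mb->data[i];
	return (sum);
}

/*
 * Decide whether a buffer left over from the previous boot can be
 * kept. Returns 1 if it was kept, 0 if it had to be cleared.
 */
static int
x_msgbuf_reinit(struct x_msgbuf *mb)
{
	assert(mb != NULL);
	if (mb->magic == X_MAGIC && mb->cksum == x_cksum(mb))
		return (1);
	memset(mb->data, 0, sizeof(mb->data));
	mb->magic = X_MAGIC;
	mb->cksum = 0;
	return (0);
}
```

The "cksum mismatch" case corresponds to the second branch: the marker
survived the reboot, but some of the contents did not.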

I guess the message could be changed to cause less alarm, or it
could be hidden behind bootverbose; I just thought it useful to
indicate that the previous message buffer was mostly there in case
somebody who really needs it preserved wants to disable the check.
The behaviour here could possibly also be made a loader.conf tunable,
but I didn't test whether tunables can be used that early in the
boot process.

Ian

Re: NFS -current

2003-03-26 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Terry Lambert writes:
>Ian Dowse wrote:
>> It is normal enough to get the above 'not responding' errors
>> occasionally on a busy fileserver, but only if they are almost
>> immediately followed by 'is alive again' messages.
>
>This is arguably a bug in the FreeBSD UDP packet reassembly code
...

Actually, I was referring here to an effect that occurs when the
time taken by the server to complete requests varies in a particular
way. The NFS client may observe a large number of requests all
answered within a few milliseconds, so it starts using short timeouts.

Then for some reason (usually a long list of outstanding disk-intensive
operations), the server takes a few seconds to complete the next
request. Within this time, the client repeatedly times out, retransmits
the request, backs off and repeats, and in extreme cases it is
possible for the client to reach the limit that triggers the "not
responding" warning.
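
A small worked example of the effect, assuming a client that simply
doubles its timeout after every miss. The doubling rule and the numbers
are illustrative only and are not taken from the NFS client code.

```c
#include <assert.h>

/*
 * Count how many times a client times out before a reply that takes
 * reply_ms arrives, if it starts at timeout_ms and doubles each time.
 */
static int
count_timeouts(int timeout_ms, int reply_ms)
{
	int elapsed = 0, n = 0;

	assert(timeout_ms > 0 && reply_ms >= 0);
	while (elapsed + timeout_ms < reply_ms) {
		elapsed += timeout_ms;	/* this attempt expired */
		timeout_ms *= 2;	/* back off before retransmitting */
		n++;
	}
	return (n);
}
```

Under these assumptions a client that has settled on a 10 ms timeout
times out eight times while waiting for a single 3-second reply, whereas
one starting at 1 second times out once. A run of fast replies is what
makes the later slow reply look like an unresponsive server.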

Ian

Re: NFS -current

2003-03-25 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Patric Mrawek writes:
>On several clients (-DP1, -DP2, 4-stable) mounting a nfs-share
>(mount_nfs -i -U -3 server:/nfs /mnt) and then copying data from that
>share to the local disk (find -x -d /mnt | cpio -pdumv /local) results
>in lost NFS-mount.
>
>client kernel: nfs server server:/nfs: not responding 10 > 9

I'm not sure what you mean by a "lost" mount. Do all further accesses
to the filesystem hang?

It is normal enough to get the above 'not responding' errors
occasionally on a busy fileserver, but only if they are almost
immediately followed by 'is alive again' messages.

If the filesystem stops working and doesn't recover, could you run
`tcpdump -nepX -s 1600 udp port 2049' when it hangs and record a
few packets?

Ian

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message

Re: panic: lockmgr: draining against myself

2002-12-29 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, ryan beasley writes:
>
>panic: lockmgr: draining against myself

I've just checked in revision 1.426 of vfs_subr.c which may solve
this, but I was not able to reproduce it myself. Could you or anybody
else who has seen this panic try the above revision to see if it
helps? Note that you will need HEAD rather than RELENG_5_0 to get
this change.

There is probably a better approach to solve the VOP_INACTIVE
recursion problem than the one I used though - I think maybe having
a vnode flag that remembers whether a VOP_INACTIVE call is necessary
would be more general than the VI_DOINGINACT flag.

Ian


Re: backgroud fsck is still locking up system (fwd)

2002-12-07 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Kirk McKusick writes:
>Adding a two minute delay before starting background fsck
>sounds like a very good idea to me. Please send me your
>suggested change.

BTW, I've been using a fsck_ffs modification for a while now that
does something like the disabled kernel I/O slowdown, but from
userland. It seems to help quite a lot in leaving some disk bandwidth
for other processes. Waiting a while before starting the fsck seems
like a good idea anyway though. Patch below (I think I posted an
earlier version of this before).
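
The throttling arithmetic in the patch (a running average weighted 63/64
toward history, and a sleep of 16 times that average on one read in
eight) can be tried in isolation. The helper names below are
hypothetical; only the two formulas come from the patch.

```c
#include <assert.h>

/* Fold one measured I/O time into a running average with weight 1/64. */
static int
avg_update(int avg_usec, int sample_usec)
{
	assert(avg_usec >= 0 && sample_usec >= 0);
	return ((avg_usec * 63 + sample_usec) >> 6);
}

/* Sleep applied to one operation in every eight: 16x the average. */
static int
throttle_sleep_usec(int avg_usec)
{
	assert(avg_usec >= 0);
	return (avg_usec * 16);
}
```

If every read took exactly the average time, eight reads would cost 8
units of I/O followed by 16 units of sleep, so fsck would be issuing
I/O for roughly a third of the time and leaving the rest to other
processes.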

Ian

Index: fsutil.c
===
RCS file: /dump/FreeBSD-CVS/src/sbin/fsck_ffs/fsutil.c,v
retrieving revision 1.19
diff -u -r1.19 fsutil.c
--- fsutil.c    27 Nov 2002 02:18:57 -  1.19
+++ fsutil.c    4 Dec 2002 02:16:28 -
@@ -40,6 +40,7 @@
 #endif /* not lint */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -62,7 +63,13 @@
 
 #include "fsck.h"
 
+static void slowio_start(void);
+static void slowio_end(void);
+
 long   diskreads, totalreads;  /* Disk cache statistics */
+struct timeval slowio_starttime;
+int slowio_delay_usec = 1; /* Initial IO delay for background fsck */
+int slowio_pollcnt;
 
 int
 ftypeok(union dinode *dp)
@@ -350,10 +357,15 @@
 
offset = blk;
offset *= dev_bsize;
+   if (bkgrdflag)
+   slowio_start();
if (lseek(fd, offset, 0) < 0)
rwerror("SEEK BLK", blk);
-   else if (read(fd, buf, (int)size) == size)
+   else if (read(fd, buf, (int)size) == size) {
+   if (bkgrdflag)
+   slowio_end();
return (0);
+   }
rwerror("READ BLK", blk);
if (lseek(fd, offset, 0) < 0)
rwerror("SEEK BLK", blk);
@@ -463,6 +475,39 @@
idesc.id_blkno = blkno;
idesc.id_numfrags = frags;
(void)pass4check(&idesc);
+}
+
+/* Slow down IO so as to leave some disk bandwidth for other processes */
+void
+slowio_start()
+{
+
+	/* Delay one in every 8 operations by 16 times the average IO delay */
+	slowio_pollcnt = (slowio_pollcnt + 1) & 7;
+	if (slowio_pollcnt == 0) {
+		usleep(slowio_delay_usec * 16);
+		gettimeofday(&slowio_starttime, NULL);
+	}
+}
+
+void
+slowio_end()
+{
+	struct timeval tv;
+	int delay_usec;
+
+	if (slowio_pollcnt != 0)
+		return;
+
+	/* Update the slowdown interval. */
+	gettimeofday(&tv, NULL);
+	delay_usec = (tv.tv_sec - slowio_starttime.tv_sec) * 1000000 +
+	    (tv.tv_usec - slowio_starttime.tv_usec);
+	if (delay_usec < 64)
+		delay_usec = 64;
+	if (delay_usec > 100)
+		delay_usec = 100;
+	slowio_delay_usec = (slowio_delay_usec * 63 + delay_usec) >> 6;
 }
 
 /*


panic: ata_dmasetup: transfer active on this device!

2002-12-03 Thread Ian Dowse


Hi Søren,

I get the above panic every few days when resuming, especially if
the disk was active while the laptop was suspending - it's easy to
reproduce by starting some disk-intensive activity and then hitting
the suspend button. I see that IWASAKI-san posted patches for this
a few months ago - do you have any plans to incorporate his work?


http://docs.freebsd.org/cgi/getmsg.cgi?fetch=814727+0+archive/2002/freebsd-current/20020908.freebsd-current

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=822137+0+archive/2002/freebsd-current/20020908.freebsd-current

ata0: resetting devices ..
done
ata1: resetting devices ..
done
panic: ata_dmasetup: transfer active on this device!

Ian


Re: unkillable process - 'mdconfig -t vnode' on small file

2002-11-30 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Michal 
Mertl writes:
>Subject says it all.

Fixed in md.c revision 1.74 - this was discussed here a few days
ago, but I was just waiting for approval to commit the fix.

Ian


Re: MD broken in current

2002-11-28 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Hiten Pandya writes:
>Is anyone planning to take this task, because, I think its important
>that it is fixed.  Or should it be put on the 5.0-todo list?  If not, we
>should put it in the BUGS section of mdconfig/ or the md(4) manual page.
>IMO.

I've tested the fix, and I'm just waiting for re@ approval to commit
it.

Ian


Re: MD broken in current

2002-11-27 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Bruce Evans writes:
>On Wed, 27 Nov 2002, Ian Dowse wrote:
>> I think moving the line
>>
>>  tsleep(sc, PRIBIO, "mdwait", 0);
>>
>> to just after the following `if' statement may do the trick. If the
>
>Wouldn't Giant locking prevent races here?  There is no locking in
>sight for the ioctl, but ioctl() holds Giant.

Yes, but mddestroy() assumes that the kthread is waiting in the
"mdwait" tsleep() when it calls wakeup(). That won't be true if the
kthread has not yet had a chance to run, or if it is waiting to
acquire Giant before entering the main loop (or if anything it calls
drops Giant). Moving the check of the MD_SHUTDOWN to before the
tsleep should catch all of these cases, and Giant ensures that the
wakeup() does not occur after the flag is tested but before the
tsleep().
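
The ordering argument can be shown in userland with a mutex and
condition variable, where the mutex plays the role that Giant plays in
md.c. This is only an analogy: the struct and function names are
hypothetical, and pthread_cond_wait() stands in for tsleep().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical worker state; the mutex stands in for Giant. */
struct worker {
	pthread_mutex_t lock;
	pthread_cond_t  cv;
	bool            shutdown;
	int             passes;	/* trips round the loop that did work */
};

/* Ask the worker to exit. Safe to call before the worker has started. */
static void
worker_stop(struct worker *w)
{
	assert(w != NULL);
	pthread_mutex_lock(&w->lock);
	w->shutdown = true;
	pthread_cond_signal(&w->cv);	/* lost if nobody is waiting yet */
	pthread_mutex_unlock(&w->lock);
}

/* Main loop: test the flag first, and sleep only if it is still clear. */
static void
worker_loop(struct worker *w)
{
	assert(w != NULL);
	pthread_mutex_lock(&w->lock);
	for (;;) {
		if (w->shutdown)
			break;
		/* ... process queued work here ... */
		w->passes++;
		pthread_cond_wait(&w->cv, &w->lock);
	}
	pthread_mutex_unlock(&w->lock);
}
```

If worker_stop() runs before worker_loop() is entered, the signal has no
waiter and is lost, but the loop still exits because it looks at the
flag before it ever sleeps. Holding the lock across the test and the
wait is what rules out a wakeup slipping in between them.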

Ian


Re: MD broken in current

2002-11-27 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Bruce Evans writes:
>Better fix mddestroy().  I don't know why it hangs ... I guess it is
>because it is called before initialization is completed in mdinit(),
>and there aren't enough state checks in mddestroy().

I think moving the line

tsleep(sc, PRIBIO, "mdwait", 0);

to just after the following `if' statement may do the trick. If the
wakeup() from mddestroy() comes in before md_kthread() gets to the
main loop, then it would be missed. I think jhb posted a better way
of synchronising with kthreads during their destruction, but I
haven't found the time to look into that yet.

Ian


Re: [acpi-jp 1933] Re: acpid implementation?

2002-11-09 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Takanori Watanabe writes:
>==
>Next way is that make /dev/acpictl node that can open
>exclusively and catch the power event by it, like apmd.
>==
>
>This way requires that the event reading proceess should 
>be only one, so we need another device node to read event.

Yes, exactly - I think that your suggestion of extending /dev/devctl
to support device-specific events to be handled by devd is a much
nicer solution though.

>options PSM_HOOKRESUME  #hook the system resume event, useful
>#for some laptops
>options PSM_RESETAFTERSUSPEND   #reset the device at the resume event
>
>will resolve your problem without the patch.

Cool, thanks. I didn't know those options existed - I'll try them
out next time I reboot.

Ian


Re: [acpi-jp 1925] Re: acpid implementation?

2002-11-09 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Takanori Watanabe writes:
>It is obious there will be good if we have a way to catch power 
>event from userland.
>
>I have some ideas to implement it.
>One way is implement with kqueue(2) and /dev/acpi to
>get power events. This way does not require daemons
>to wait the event exclusively. Each process that requires
>to get power event can handle it by the interface.
>I wrote the experimental code a years ago.

I've been using the following far-from-ideal patch for a while now -
it just supplies binary integers to /dev/acpi whenever the sleep
state changes. The choice of encoding of data is stupid, and the
acpiread() doesn't do blocking - I just use it in a script like

	while :; do
		sleep 5
		acpidat="`wc -c < /dev/acpi`"
		if [ "$acpidat" -gt 0 ]; then
			killall -HUP moused
		fi
	done

to send a SIGHUP to moused after a resume, which seems to be necessary
on my Vaio C1.
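
Instead of polling from a shell loop, a small C reader could wait on the
descriptor with poll(2). This is a sketch that assumes a device node
which reports readability through poll(2) and delivers the sleep states
as binary ints, as the patch below intends; the function name is
hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable, then read whole
 * ints from it. Returns the number of ints read, 0 on timeout, or -1
 * on error.
 */
static int
read_sleep_events(int fd, int *ev, int max, int timeout_ms)
{
	struct pollfd pfd;
	ssize_t n;
	int r;

	assert(ev != NULL && max > 0);
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	r = poll(&pfd, 1, timeout_ms);
	if (r <= 0)
		return (r);	/* 0: timed out, -1: poll failed */
	n = read(fd, ev, (size_t)max * sizeof(int));
	if (n < 0)
		return (-1);
	return ((int)(n / (ssize_t)sizeof(int)));
}
```

A caller would open /dev/acpi once, loop on read_sleep_events(), and
signal moused whenever an event with value 0 (back in S0) shows up.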

Ian

Index: acpi.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/dev/acpica/acpi.c,v
retrieving revision 1.80
diff -u -r1.80 acpi.c
--- acpi.c  31 Oct 2002 20:23:41 -  1.80
+++ acpi.c  9 Nov 2002 20:20:10 -
@@ -32,6 +32,7 @@
 #include "opt_acpi.h"
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -42,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -69,16 +71,18 @@
 
 static d_open_t		acpiopen;
 static d_close_t	acpiclose;
+static d_read_t		acpiread;
 static d_ioctl_t	acpiioctl;
+static d_poll_t		acpipoll;
 
 #define CDEV_MAJOR 152
 static struct cdevsw acpi_cdevsw = {
 acpiopen,
 acpiclose,
-noread,
+acpiread,
 nowrite,
 acpiioctl,
-nopoll,
+acpipoll,
 nommap,
 nostrategy,
 "acpi",
@@ -1327,6 +1331,9 @@
}
 
sc->acpi_sstate = state;
+   if (sc->acpi_usereventq_len < ACPI_USER_EVENTQ_LEN)
+   sc->acpi_usereventq[sc->acpi_usereventq_len++] = state;
+   selwakeup(&sc->acpi_selp);
sc->acpi_sleep_disabled = 1;
 
/*
@@ -1375,6 +1382,9 @@
AcpiLeaveSleepState((UINT8)state);
DEVICE_RESUME(root_bus);
sc->acpi_sstate = ACPI_STATE_S0;
+   if (sc->acpi_usereventq_len < ACPI_USER_EVENTQ_LEN)
+   sc->acpi_usereventq[sc->acpi_usereventq_len++] = ACPI_STATE_S0;
+   selwakeup(&sc->acpi_selp);
acpi_enable_fixed_events(sc);
break;
 
@@ -1808,6 +1818,35 @@
 return(0);
 }
 
+int
+acpiread(dev_t dev, struct uio *uio, int flag)
+{
+    struct acpi_softc	*sc;
+    int			bytes, error, events, i;
+
+ACPI_LOCK;
+
+sc = dev->si_drv1;
+
+error = 0;
+if (uio->uio_resid >= sizeof(int) && sc->acpi_usereventq_len > 0) {
+   events = sc->acpi_usereventq_len;
+   if (events > uio->uio_resid / sizeof(int))
+   events = uio->uio_resid / sizeof(int);
+   bytes = events * sizeof(int);
+   error = uiomove((caddr_t)sc->acpi_usereventq, bytes, uio);
+   if (!error) {
+   for (i = 0; i < sc->acpi_usereventq_len - events; i++)
+   sc->acpi_usereventq[i] = sc->acpi_usereventq[i + events];
+   sc->acpi_usereventq_len -= events;
+   }
+}
+
+ACPI_UNLOCK;
+
+return (error);
+}
+
 static int
 acpiioctl(dev_t dev, u_long cmd, caddr_t addr, int flag, d_thread_t *td)
 {
@@ -1871,6 +1910,25 @@
 out:
 ACPI_UNLOCK;
 return(error);
+}
+
+static  int
+acpipoll(dev_t dev, int events, d_thread_t *td)
+{
+    struct acpi_softc	*sc;
+    int			revents;
+
+ACPI_LOCK;
+sc = dev->si_drv1;
+
+revents = events & (POLLOUT | POLLWRNORM);
+if ((events & (POLLIN | POLLRDNORM)) && sc->acpi_usereventq_len > 0) {
+   revents |= (POLLIN | POLLRDNORM);
+   selrecord(td, &sc->acpi_selp);
+}
+
+ACPI_UNLOCK;
+return (revents);
 }
 
 static int
Index: acpivar.h
===
RCS file: /dump/FreeBSD-CVS/src/sys/dev/acpica/acpivar.h,v
retrieving revision 1.37
diff -u -r1.37 acpivar.h
--- acpivar.h   31 Oct 2002 17:58:38 -  1.37
+++ acpivar.h   9 Nov 2002 20:20:10 -
@@ -30,6 +30,7 @@
 
 #include "bus_if.h"
 #include 
+#include 
 #include 
 #if __FreeBSD_version >= 50
 #include 
@@ -50,6 +51,11 @@
     int			acpi_enabled;
     int			acpi_sstate;
     int			acpi_sleep_disabled;
+
+#define ACPI_USER_EVENTQ_LEN	4
+    int			acpi_usereventq[ACPI_USER_EVENTQ_LEN];
+    int			acpi_usereventq_len;
+    struct selinfo	acpi_selp;
 
 struct sysctl_ctx_list acpi_sysctl_ctx;
 struct sysctl_oid  *acpi_sysctl_tree;
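
The acpiread() hunk above hands the caller as many queued events as fit in its buffer and then slides the unread events to the front of the array. A minimal userspace sketch of that queue discipline follows. It assumes a plain struct in place of the softc and memcpy() in place of uiomove(); struct eventq, eventq_push() and eventq_read() are illustrative names that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EVENTQ_LEN 4	/* mirrors ACPI_USER_EVENTQ_LEN in the patch */

/* Hypothetical stand-in for the queue fields added to the softc. */
struct eventq {
	int	ev[EVENTQ_LEN];
	int	len;		/* number of valid entries in ev[] */
};

/* Append an event; drop it silently if the queue is full, as the patch does. */
static void
eventq_push(struct eventq *q, int event)
{
	if (q->len < EVENTQ_LEN)
		q->ev[q->len++] = event;
}

/*
 * Copy out as many whole events as fit in a buffer of bufsize bytes, then
 * shift the unread events to the front. Returns the number of events copied.
 */
static int
eventq_read(struct eventq *q, int *buf, size_t bufsize)
{
	int n = q->len;

	assert(q->len >= 0 && q->len <= EVENTQ_LEN);
	if ((size_t)n > bufsize / sizeof(int))
		n = (int)(bufsize / sizeof(int));
	if (n == 0)
		return (0);
	memcpy(buf, q->ev, (size_t)n * sizeof(int));
	memmove(q->ev, q->ev + n, (size_t)(q->len - n) * sizeof(int));
	q->len -= n;
	return (n);
}
```

With a queue this small the shift is cheap, which is presumably why a flat array is enough here and a ring buffer is not needed.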


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscr

Re: kern/42417 cannot probe Olympus digital camera, "C-1"

2002-10-31 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Nate Lawson writes:
>I looked at the change and it seems good.  Can someone more familiar with
>the USB system verify this?

Done - I have a C-1 here, so I was able to test it - obviously I haven't
accessed the camera from -current in a while!

Ian

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message

ELCR in PCI-ISA bridge not getting set on resume

2002-10-28 Thread Ian Dowse


Since starting to use -current with ACPI on a Sony C1 laptop, I
noticed that after resume, occasionally IRQ 9 would get stuck and
not deliver any interrupts. IRQ 9 is shared by sound, USB and the
pccard slot. It turned out that something was not saving the ELCR
edge/level control registers in the PCI-ISA bridge, so on resume
IRQ 9 was configured in edge-triggered mode, making interrupt loss
inevitable.

The patch below makes the "isab" driver save and restore the ELCR
around suspends on the Intel 82371AB. Any comments on whether this
is the right way or the right place to solve the problem?

Ian

Index: isa_pci.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/dev/pci/isa_pci.c,v
retrieving revision 1.6
diff -u -r1.6 isa_pci.c
--- isa_pci.c   21 Dec 2001 01:28:46 -  1.6
+++ isa_pci.c   29 Oct 2002 01:01:33 -
@@ -37,21 +37,36 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
+#include 
 
 #include 
 #include 
 
+#define ELCR_IOADDR 0x4d0   /* Interrupt Edge/Level Control Registers */
+#define ELCR_IOLEN  2
+
+struct isab_softc {
+struct resource *elcr_res;
+u_char saved_elcr[ELCR_IOLEN];
+};
+
 static int isab_probe(device_t dev);
 static int isab_attach(device_t dev);
+static int isab_detach(device_t dev);
+static int isab_resume(device_t dev);
+static int isab_suspend(device_t dev);
 
 static device_method_t isab_methods[] = {
 /* Device interface */
 DEVMETHOD(device_probe,isab_probe),
 DEVMETHOD(device_attach,   isab_attach),
+DEVMETHOD(device_detach,   isab_detach),
 DEVMETHOD(device_shutdown, bus_generic_shutdown),
-DEVMETHOD(device_suspend,  bus_generic_suspend),
-DEVMETHOD(device_resume,   bus_generic_resume),
+DEVMETHOD(device_suspend,  isab_suspend),
+DEVMETHOD(device_resume,   isab_resume),
 
 /* Bus interface */
 DEVMETHOD(bus_print_child, bus_generic_print_child),
@@ -68,7 +83,7 @@
 static driver_t isab_driver = {
 "isab",
 isab_methods,
-0,
+sizeof(struct isab_softc),
 };
 
 static devclass_t isab_devclass;
@@ -143,14 +158,82 @@
 isab_attach(device_t dev)
 {
 device_t   child;
+struct isab_softc *sc = device_get_softc(dev);
+int error, rid;
 
 /*
  * Attach an ISA bus.  Note that we can only have one ISA bus.
  */
 child = device_add_child(dev, "isa", 0);
-if (child != NULL)
-   return(bus_generic_attach(dev));
+if (child != NULL) {
+   error = bus_generic_attach(dev);
+   if (error)
+return (error);
+}
+
+switch (pci_get_devid(dev)) {
+case 0x71108086: /* Intel 82371AB */
+   /*
+* Sometimes the ELCR (Edge/Level Control Register) is not restored
+* correctly on resume by the BIOS, so we handle it ourselves.
+*/
+   rid = 0;
+   bus_set_resource(dev, SYS_RES_IOPORT, rid, ELCR_IOADDR, ELCR_IOLEN);
+   sc->elcr_res = bus_alloc_resource(dev, SYS_RES_IOPORT, &rid, 0, ~0, 1,
+   RF_ACTIVE);
+   if (sc->elcr_res == NULL)
+   device_printf(dev, "failed to allocate ELCR resource\n");
+break;
+}
 
 return(0);
 }
 
+static int
+isab_detach(device_t dev)
+{
+struct isab_softc *sc = device_get_softc(dev);
+
+if (sc->elcr_res != NULL)
+   bus_release_resource(dev, SYS_RES_IOPORT, 0, sc->elcr_res);
+
+ return (bus_generic_detach(dev));
+}
+
+static int
+isab_suspend(device_t dev)
+{
+struct isab_softc *sc = device_get_softc(dev);
+bus_space_tag_t bst;
+bus_space_handle_t bsh;
+int i;
+
+/* Save the ELCR if required. */
+if (sc->elcr_res != NULL) {
+   bst = rman_get_bustag(sc->elcr_res);
+   bsh = rman_get_bushandle(sc->elcr_res);
+   for (i = 0; i < ELCR_IOLEN; i++)
+   sc->saved_elcr[i] = bus_space_read_1(bst, bsh, i);
+}
+
+return (bus_generic_suspend(dev));
+}
+
+static int
+isab_resume(device_t dev)
+{
+struct isab_softc *sc = device_get_softc(dev);
+bus_space_tag_t bst;
+bus_space_handle_t bsh;
+int i;
+
+/* Restore the ELCR if required. */
+if (sc->elcr_res != NULL) {
+   bst = rman_get_bustag(sc->elcr_res);
+   bsh = rman_get_bushandle(sc->elcr_res);
+   for (i = 0; i < ELCR_IOLEN; i++)
+   bus_space_write_1(bst, bsh, i, sc->saved_elcr[i]);
+}
+
+return (bus_generic_resume(dev));
+}
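
The ELCR is a pair of byte-wide registers: 0x4d0 covers IRQ 0-7 and 0x4d1 covers IRQ 8-15, with a set bit meaning level-triggered. A minimal userspace model of the save and restore done by the patch above, assuming a byte array in place of the real I/O ports (struct elcr_model and the elcr_*() helpers are illustrative names, not driver code):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ELCR_IOLEN 2	/* 0x4d0 covers IRQ 0-7, 0x4d1 covers IRQ 8-15 */

/* Hypothetical stand-in for the two I/O ports and the softc's save area. */
struct elcr_model {
	uint8_t	regs[ELCR_IOLEN];	/* live register contents */
	uint8_t	saved[ELCR_IOLEN];	/* copy taken at suspend */
};

/* A set bit means the IRQ is level-triggered; clear means edge-triggered. */
static int
elcr_is_level(const struct elcr_model *m, int irq)
{
	assert(irq >= 0 && irq < 8 * ELCR_IOLEN);
	return ((m->regs[irq / 8] >> (irq % 8)) & 1);
}

/* Equivalent of the suspend method: remember both registers. */
static void
elcr_suspend(struct elcr_model *m)
{
	memcpy(m->saved, m->regs, ELCR_IOLEN);
}

/* Models a resume where nothing restores the ELCR: all IRQs become edge. */
static void
elcr_lose_state(struct elcr_model *m)
{
	memset(m->regs, 0, ELCR_IOLEN);
}

/* Equivalent of the resume method: write the saved values back. */
static void
elcr_resume(struct elcr_model *m)
{
	memcpy(m->regs, m->saved, ELCR_IOLEN);
}
```

In this model IRQ 9 is bit 1 of the second register, so setting regs[1] to 0x02 and running suspend, lose_state and resume in turn shows the level-triggered setting surviving only because of the saved copy.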



Re: Crash again

2002-10-24 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Vallo Kallaste writes:
>The same kernel compiled for purposes of smbfs debugging crashed
>again. I had X, make -j2 running and no smbfs mounts. For what it's
>worth, the system did hang hard (no interrupts) some minutes before
>the aforementioned crash and I had to reboot by hand.

From the trace, we are recursing on ufs_inactive() because it needs
to grab and release a vnode reference from within vput()->VOP_INACTIVE(),
so the second vput() causes another call to VOP_INACTIVE. This looks
like something the VINACTIVE patch I posted a while ago would fix:

http://www.maths.tcd.ie/~iedowse/FreeBSD/vinactive.diff

(Sorry, I haven't updated it, so it probably needs manual merging)
See also the comments by Don Lewis on this list ("Re: nfs_inactive()
bug? -> panic: lockmgr:").

Kirk, is this vput() recursion expected? If so, something like the
patch above should ensure that the second vput() does not call
VOP_INACTIVE() again.

Ian

>(kgdb) where
>#0  doadump () at ../../../kern/kern_shutdown.c:223
>#1  0xc0236e8a in boot (howto=260) at ../../../kern/kern_shutdown.c:355
>#2  0xc0237147 in panic () at ../../../kern/kern_shutdown.c:508
>#3  0xc027e982 in bwrite (bp=0xce3e92dc) at ../../../kern/vfs_bio.c:796
>#4  0xc027f039 in bawrite (bp=0x0) at ../../../kern/vfs_bio.c:1085
>#5  0xc033b16f in ffs_fsync (ap=0xd795958c) at ../../../ufs/ffs/ffs_vnops.c:236
>#6  0xc033a499 in ffs_sync (mp=0xc4057400, waitfor=2, cred=0xc13c3e80, 
>td=0xc0436be0) at vnode_if.h:612
>#7  0xc02939e8 in sync (td=0xc0436be0, uap=0x0)
>at ../../../kern/vfs_syscalls.c:130
>#8  0xc0236a6b in boot (howto=256) at ../../../kern/kern_shutdown.c:264
>#9  0xc0237147 in panic () at ../../../kern/kern_shutdown.c:508
>#10 0xc0228f80 in lockmgr (lkp=0xc45fd43c, flags=65543, interlkp=0xc45fd378, 
>td=0xc3eb70d0) at ../../../kern/kern_lock.c:433
>#11 0xc028640c in vop_stdlock (ap=0x0) at ../../../kern/vfs_default.c:279
>#12 0xc0349348 in ufs_vnoperate (ap=0x0) at ../../../ufs/ufs/ufs_vnops.c:2763
>#13 0xc0290d6a in vclean (vp=0xc45fd378, flags=8, td=0xc3eb70d0)
>at vnode_if.h:990
>#14 0xc029142a in vgonel (vp=0xc45fd378, td=0x0)
>at ../../../kern/vfs_subr.c:2665
>#15 0xc0291310 in vrecycle (vp=0xc45fd378, inter_lkp=0x0, td=0x0)
>at ../../../kern/vfs_subr.c:2620
>#16 0xc03413fc in ufs_inactive (ap=0x0) at ../../../ufs/ufs/ufs_inode.c:133
>#17 0xc0349348 in ufs_vnoperate (ap=0x0) at ../../../ufs/ufs/ufs_vnops.c:2763
>#18 0xc0290420 in vput (vp=0xc45fd378) at vnode_if.h:930
>#19 0xc033212d in handle_workitem_freeblocks (freeblks=0xc4d07500, flags=0)
>at ../../../ufs/ffs/ffs_softdep.c:2494
>#20 0xc03315f4 in softdep_setup_freeblocks (ip=0xc4d4c500, length=0, 
>flags=2048) at ../../../ufs/ffs/ffs_softdep.c:2077
>---Type <return> to continue, or q <return> to quit---
>#21 0xc0327938 in ffs_truncate (vp=0xc45fd378, length=0, flags=3072, cred=0x0, 
>td=0xc3eb70d0) at ../../../ufs/ffs/ffs_inode.c:271
>#22 0xc03412c8 in ufs_inactive (ap=0x0) at ../../../ufs/ufs/ufs_inode.c:100
>#23 0xc0349348 in ufs_vnoperate (ap=0x0) at ../../../ufs/ufs/ufs_vnops.c:2763
>#24 0xc0290420 in vput (vp=0xc45fd378) at vnode_if.h:930
>#25 0xc03336c2 in handle_workitem_remove (dirrem=0xc4567140, xp=0x0)
>at ../../../ufs/ffs/ffs_softdep.c:3324
>#26 0xc032f0ed in process_worklist_item (matchmnt=0x0, flags=0)
>at ../../../ufs/ffs/ffs_softdep.c:727
>#27 0xc032eea0 in softdep_process_worklist (matchmnt=0x0)
>at ../../../ufs/ffs/ffs_softdep.c:624
>#28 0xc028f411 in sched_sync () at ../../../kern/vfs_subr.c:1739
>#29 0xc0222e14 in fork_exit (callout=0xc028f010 , arg=0x0, 
>frame=0x0) at ../../../kern/kern_fork.c:860


Re: mozilla vs linux emulation in -current?

2002-10-23 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>IP, but we were throwing away the modified version). Commit if it
>works, and I'll look properly tomorrow. Sorry for the breakage.

With the one compile error fixed, this seemed to make `telnet 0.0.0.0'
work again, so I went ahead and checked it in.

Ian


Re: mozilla vs linux emulation in -current?

2002-10-23 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Peter Wemm writes:
>Has anybody else noticed this in -current?  Mozilla hangs for a minute or
>so at regular intervals..
>16:07:31.896548 216.145.52.172.20167 > 0.0.0.0.16001: S 1175926117:1175926117(

Sounds like something I may have broken... Need to sleep now, but
you could try the following. I think in_pcbconnect() used to do
some evil stuff where it would modify the supplied sockaddr,
tcp_connect was depending on this, and I failed to notice it
(in_pcbconnect maps a destination address of INADDR_ANY into a local
IP, but we were throwing away the modified version). Commit if it
works, and I'll look properly tomorrow. Sorry for the breakage.

Ian

Index: tcp_usrreq.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/netinet/tcp_usrreq.c,v
retrieving revision 1.83
diff -u -r1.83 tcp_usrreq.c
--- tcp_usrreq.c21 Oct 2002 13:55:50 -  1.83
+++ tcp_usrreq.c24 Oct 2002 01:27:27 -
@@ -876,14 +876,14 @@
if (oinp != inp && (otp = intotcpcb(oinp)) != NULL &&
otp->t_state == TCPS_TIME_WAIT &&
(ticks - otp->t_starttime) < tcp_msl &&
-   (otp->t_flags & TF_RCVD_CC))
+   (otp->t_flags & TF_RCVD_CC)) {
+   inp->inp_faddr = oinp->inp_faddr;
+   inp->inp_fport = oinp->inp_fport;
otp = tcp_close(otp);
-   else
+   } else
return EADDRINUSE;
}
inp->inp_laddr = laddr;
-   inp->inp_faddr = sin->sin_addr;
-   inp->inp_fport = sin->sin_port;
in_pcbrehash(inp);

/* Compute window scaling to request.  */


Re: -CURRENT running really slow under vmware2

2002-10-12 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Jim Pirzyk writes:
>I would think we need to at least patch current for this case.  Enclosed
>is a possible implementation.  Comments?

I think I tried this before, and putting the option in opt_cpu.h
does not work, because not all files that include atomic.h will
include opt_cpu.h. The other options referenced in atomic.h are all
in opt_global.h, so CPU_DISABLE_CMPXCHG needs to go there too (note
that the instruction is called cmpxchg, not cmpxfhg BTW).

Ian


Re: [ GEOM tests ] disklabel warnings and vinum drives lost

2002-10-05 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Poul-Henning Kamp writes:
>Make that _three_ bugs:  vinum opens devices directly at the cdevsw
>level, bypassing in the process the vnodes and specfs.

Here is a patch that makes it use vn_open/vn_close/VOP_IOCTL,
bringing it much closer to the way ccd(4) does things. I have only
lightly tested this so far - I saw one problem where a md(4)
vnode-backed device got stuck in mddestroy(), but I haven't tracked
down if that is related (the md vnode was just a file on a vinum-backed
filesystem).

Ian

Index: vinumio.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/dev/vinum/vinumio.c,v
retrieving revision 1.76
diff -u -r1.76 vinumio.c
--- vinumio.c   5 Oct 2002 03:44:00 -   1.76
+++ vinumio.c   5 Oct 2002 14:12:56 -
@@ -51,33 +51,26 @@
 open_drive(struct drive *drive, struct thread *td, int verbose)
 {
 struct nameidata nd;
-struct cdevsw *dsw; /* pointer to cdevsw entry */
-int error;
+int flags;
 
 if (drive->flags & VF_OPEN)/* open already, */
return EBUSY;   /* don't do it again */
 
-NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, drive->devicename,
-curthread);
-error = namei(&nd);
-if (error)
-   return (error);
-if (!vn_isdisk(nd.ni_vp, &error)) {
-   NDFREE(&nd, 0);
-   return (error);
-}
-drive->dev = udev2dev(nd.ni_vp->v_rdev->si_udev, 0);
-NDFREE(&nd, 0);
-
-if (drive->dev == NULL)/* didn't find anything */
-   return ENODEV;
-
-drive->dev->si_iosize_max = DFLTPHYS;
-dsw = devsw(drive->dev);
-if (dsw == NULL)
-   drive->lasterror = ENOENT;
-else
-   drive->lasterror = (dsw->d_open) (drive->dev, FWRITE | FREAD, 0, NULL);
+drive->devvp = NULL;
+NDINIT(&nd, LOOKUP, FOLLOW, UIO_SYSSPACE, drive->devicename, td);
+flags = FREAD | FWRITE;
+drive->lasterror = vn_open(&nd, &flags, 0);
+if (drive->lasterror == 0) {
+   (void) vn_isdisk(nd.ni_vp, &drive->lasterror);
+   if (drive->lasterror != 0 && vrefcnt(nd.ni_vp) > 1)
+   drive->lasterror = EBUSY;
+   VOP_UNLOCK(nd.ni_vp, 0, td);
+   NDFREE(&nd, NDF_ONLY_PNBUF);
+   if (drive->lasterror == 0)
+   drive->devvp = nd.ni_vp;
+   else
+   (void) vn_close(nd.ni_vp, flags, td->td_ucred, td);
+}
 
 if (drive->lasterror != 0) {   /* failed */
drive->state = drive_down;  /* just force it down */
@@ -85,8 +78,11 @@
log(LOG_WARNING,
"vinum open_drive %s: failed with error %d\n",
drive->devicename, drive->lasterror);
-} else
+} else {
+   drive->dev = vn_todev(drive->devvp);
+   drive->dev->si_iosize_max = DFLTPHYS;
drive->flags |= VF_OPEN;/* we're open now */
+}
 
 return drive->lasterror;
 }
@@ -145,6 +141,9 @@
 int
 init_drive(struct drive *drive, int verbose)
 {
+struct thread *td;
+
+td = curthread;
 if (drive->devicename[0] != '/') {
drive->lasterror = EINVAL;
log(LOG_ERR, "vinum: Can't open drive without drive name\n");
@@ -154,17 +153,17 @@
 if (drive->lasterror)
return drive->lasterror;
 
-drive->lasterror = (*devsw(drive->dev)->d_ioctl) (drive->dev,
+drive->lasterror = VOP_IOCTL(drive->devvp,
DIOCGSECTORSIZE,
(caddr_t) & drive->sectorsize,
FREAD,
-   curthread);
+   td->td_ucred, td);
 if (drive->lasterror == 0)
-   drive->lasterror = (*devsw(drive->dev)->d_ioctl) (drive->dev,
+   drive->lasterror = VOP_IOCTL(drive->devvp,
DIOCGMEDIASIZE,
(caddr_t) & drive->mediasize,
FREAD,
-   curthread);
+   td->td_ucred, td);
 if (drive->lasterror) {
if (verbose)
log(LOG_WARNING,
@@ -211,14 +210,16 @@
 void
 close_locked_drive(struct drive *drive)
 {
+struct thread *td;
 int error;
 
+td = curthread;
 /*
  * If we can't access the drive, we can't flush
  * the queues, which spec_close() will try to
  * do.  Get rid of them here first.
  */
-error = (*devsw(drive->dev)->d_close) (drive->dev, 0, 0, NULL);
+error = vn_close(drive->devvp, FREAD | FWRITE, td->td_ucred, td);
 drive->flags &= ~VF_OPEN;  /* no longer open */
 if (drive->lasterror == 0)
drive->lasterror = error;
@@ -561,11 +562,13 @@
 int error;
 int written_config; /* set when we first write the config to disk */
 int driveno;
+struct thread *td;
 struct drive *drive; /* point to current drive info */
 struct vinum_hdr *vhdr;

Re: [ GEOM tests ] disklabel warnings and vinum drives lost

2002-10-04 Thread Ian Dowse



[CCs trimmed]

>The divide by zero problem seems to be caused by an interaction
>between two bugs: GEOM refuses to return the sector size because
...
>The next failure I get is:
>
>   Can't write config to /dev/da1s1d, error 45 (EOPNOTSUPP)

This turns out to be vinum doing a DIOCWLABEL to make the label
writable before writing its configuration, but GEOM does not support
that ioctl I presume. It should be safe to ignore these DIOCWLABEL
ioctl return values as the actual writing of the vinum data should
give a suitable error if making the label writable fails and is
important. The patch below is Robert's patch with all 3 other issues
fixed, and together, this seems to be enough to make vinum work
again.

Ian

Index: vinumio.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/dev/vinum/vinumio.c,v
retrieving revision 1.75
diff -u -r1.75 vinumio.c
--- vinumio.c   21 Aug 2002 23:39:51 -  1.75
+++ vinumio.c   5 Oct 2002 02:40:18 -
@@ -50,92 +50,25 @@
 int
 open_drive(struct drive *drive, struct thread *td, int verbose)
 {
-int devmajor; /* major devs for disk device */
-int devminor; /* minor devs for disk device */
-int unit;
-char *dname;
+struct nameidata nd;
 struct cdevsw *dsw; /* pointer to cdevsw entry */
+int error;
 
-if (bcmp(drive->devicename, "/dev/", 5)) /* device name doesn't start with /dev */
-   return ENOENT;  /* give up */
 if (drive->flags & VF_OPEN)/* open already, */
return EBUSY;   /* don't do it again */
 
-/*
- * Yes, Bruce, I know this is horrible, but we
- * don't have a root filesystem when we first
- * try to do this.  If you can come up with a
- * better solution, I'd really like it.  I'm
- * just putting it in now to add ammuntion to
- * moving the system to devfs.
- */
-dname = &drive->devicename[5];
-drive->dev = NULL; /* no device yet */
-
-/* Find the device */
-if (bcmp(dname, "ad", 2) == 0) /* IDE disk */
-   devmajor = 116;
-else if (bcmp(dname, "wd", 2) == 0)/* IDE disk */
-   devmajor = 3;
-else if (bcmp(dname, "da", 2) == 0)
-   devmajor = 13;
-else if (bcmp(dname, "vn", 2) == 0)
-   devmajor = 43;
-else if (bcmp(dname, "md", 2) == 0)
-   devmajor = 95;
-else if (bcmp(dname, "ar", 2) == 0)
-devmajor = 157;
-else if (bcmp(dname, "amrd", 4) == 0) {
-   devmajor = 133;
-   dname += 2;
-} else if (bcmp(dname, "mlxd", 4) == 0) {
-   devmajor = 131;
-   dname += 2;
-} else if (bcmp(dname, "idad", 4) == 0) {
-   devmajor = 109;
-   dname += 2;
-} else if (bcmp(dname, "twed", 4) == 0) {   /* 3ware raid */
-  devmajor = 147;
-  dname += 2;
-} else
-  return ENODEV;
-dname += 2;/* point past */
-
-/*
- * Found the device.  We can expect one of
- * two formats for the rest: a unit number,
- * then either a partition letter for the
- * compatiblity partition (e.g. h) or a
- * slice ID and partition (e.g. s2e).
- * Create a minor number for each of them.
- */
-unit = 0;
-while ((*dname >= '0') /* unit number */
-&&(*dname <= '9')) {
-   unit = unit * 10 + *dname - '0';
-   dname++;
-}
-
-if (*dname == 's') {   /* slice */
-   if (((dname[1] < '1') || (dname[1] > '4'))  /* invalid slice */
-   ||((dname[2] < 'a') || (dname[2] > 'h')))   /* or invalid partition */
-   return ENODEV;
-   devminor = ((unit & 31) << 3)   /* unit */
-   +(dname[2] - 'a')   /* partition */
-   +((dname[1] - '0' + 1) << 16)   /* slice */
-   +((unit & ~31) << 16);  /* high-order unit bits */
-} else { /* compatibility partition */
-   if ((*dname < 'a') || (*dname > 'h'))   /* or invalid partition */
-   return ENODEV;
-   devminor = (*dname - 'a')   /* partition */
-   +((unit & 31) << 3) /* unit */
-   +((unit & ~31) << 16);  /* high-order unit bits */
+NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, drive->devicename,
+curthread);
+error = namei(&nd);
+if (error)
+   return (error);
+if (!vn_isdisk(nd.ni_vp, &error)) {
+   NDFREE(&nd, 0);
+   return (error);

Re: [ GEOM tests ] disklabel warnings and vinum drives lost

2002-10-04 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Robert Watson writes:
>However, here's a patch that makes Vinum use namei() to rely on devfs to
>locate requested devices instead of parsing the device name and guessing
>the device number (incorrectly with GEOM).  Unfortunately, I almost
>immediately run into a divide by zero due to a zero sector size.  Jeff
>Roberson mentioned to me he had a fix for this bug that he sent to Greg,
>but that was never committed.

The divide by zero problem seems to be caused by an interaction
between two bugs: GEOM refuses to return the sector size because
the flags passed to d_open in vinum's open_drive() do not include
FREAD. Then vinum clobbers the ioctl's non-zero error code by calling
close_drive() from init_drive(), so the latter ends up returning
zero even though it failed.
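
The second bug is an instance of a later, successful step overwriting an earlier failure. A minimal userspace sketch of the difference, assuming a toy drive structure (struct drive_model, model_close() and both init_*() functions are hypothetical names used only for illustration):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of struct drive that matter here. */
struct drive_model {
	int	lasterror;	/* first error seen, 0 if none */
	int	is_open;
};

/* Close step that always succeeds in this model. */
static int
model_close(struct drive_model *d)
{
	assert(d != NULL);
	d->is_open = 0;
	return (0);
}

/* Buggy shape: the close result overwrites the earlier ioctl failure. */
static int
init_clobbering(struct drive_model *d, int ioctl_error)
{
	d->lasterror = ioctl_error;
	d->lasterror = model_close(d);
	return (d->lasterror);
}

/* Safer shape: record the close result only if nothing failed before. */
static int
init_preserving(struct drive_model *d, int ioctl_error)
{
	int error;

	d->lasterror = ioctl_error;
	error = model_close(d);
	if (d->lasterror == 0)
		d->lasterror = error;
	return (d->lasterror);
}
```

Feeding a non-zero ioctl error into init_clobbering() returns 0, which matches the symptom described above, while init_preserving() returns the original error.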

The next failure I get is:

Can't write config to /dev/da1s1d, error 45 (EOPNOTSUPP)

Ian


Re: NFS hang on rmdir(2) with 5.0-current client, server

2002-10-03 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Robert Watson writes:
>> > It looks like the client is basically hung waiting for an RPC response.
>> > I'd be glad to provide more debugging information if someone can point me
>> > in the right direction.
>> 
>> I haven't seen this problem with a 5.0-CURRENT client and a
>> 4.7-PRERELEASE server, so as near as I can tell the client side isn't
>> totally hosed. 
>
>Interesting observation: rm on files, and rmdir on empty directories
>doesn't trigger it, but attempt to rmdir on a non-empty directory does. 

This is an NFSv2 mount, and I think the problem is specific to NFSv2.
Something like the following patch should fix it. I probably missed
this in revision 1.112 when fixing some similar issues in other
server op functions. See the commit message for that revision for
further details.

Ian

Index: nfs_serv.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/nfsserver/nfs_serv.c,v
retrieving revision 1.123
diff -u -r1.123 nfs_serv.c
--- nfs_serv.c  25 Sep 2002 02:39:39 -  1.123
+++ nfs_serv.c  3 Oct 2002 08:30:49 -
@@ -2905,10 +2905,9 @@
if (dirp)
diraft_ret = VOP_GETATTR(dirp, &diraft, cred, td);
nfsm_reply(NFSX_WCCDATA(v3));
-   if (v3) {
+   error = 0;
+   if (v3)
nfsm_srvwcc_data(dirfor_ret, &dirfor, diraft_ret, &diraft);
-   error = 0;
-   }
/* fall through */
 
 nfsmout:



Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-12 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Terry Lambert writes:
>Ian Dowse wrote:
>> And I've just remembered a fifth :-) I think the old BSD code had
>> both an `open' count and a reference count. The open count is a
>> count of the real users of the vnode (it is what ufs_inactive really
>> wants to compare against 0), and the reference count is just for
>> places that you don't want the vnode to be recycled or destroyed.
>> This was probably lost when the encumbered BSD sources were rewritten.
>
>No, this went away with the vnode locking changes; it was in the
>4.4 code, for sure.  I think references are the correct thing here,
>and SunOS seems to agree, since that's how they implement, too.  8-).

I seem to have mis-remembered the details anyway :-) It doesn't
look as if there ever was the `open' count that I mentioned
above. Maybe I was just thinking that it would be a good way to
solve the issues of matching VOP_CLOSEs with VOP_OPENs, since there
are many cases in the kernel that do not guarantee to do a VOP_CLOSE
for each VOP_OPEN that was performed.

Handling the dropping of a last reference is always tricky to get
right when complex operations can be performed from the reference
dropping function (especially where that includes adding and then
removing references multiple times). It's even harder to do it in
a way that continues to catch missing or extra references caused
by bugs in the functions called when the reference count hits zero.

For example, if you hold the reference count at 1 while calling the
cleanup function, it allows that function to safely add and drop
references, but if that cleanup function has a bug that drops one
too many references then you end up recursing instead of detecting
it as a negative reference count. I've found in some other code
that it works reasonably well to leave the reference count at zero,
but set a flag to stop further 1->0 transitions from retriggering
the cleanup. Obviously other approaches will work too.
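
A minimal userspace sketch of that flag-based approach follows. It assumes a toy object in place of a vnode; struct obj, obj_ref(), obj_rele() and obj_cleanup() are illustrative names, and the inactive_busy field only plays the role that a flag such as VI_INACTIVE would play in the kernel.

```c
#include <assert.h>

/* Hypothetical reference-counted object standing in for a vnode. */
struct obj {
	int	refs;		/* reference count */
	int	inactive_busy;	/* cleanup in progress */
	int	cleanups;	/* how many times cleanup has run */
};

static void obj_rele(struct obj *o);

static void
obj_ref(struct obj *o)
{
	o->refs++;
}

/* Cleanup that takes and drops a temporary reference of its own. */
static void
obj_cleanup(struct obj *o)
{
	o->cleanups++;
	obj_ref(o);
	obj_rele(o);	/* a second 1->0 transition; must not re-enter cleanup */
}

static void
obj_rele(struct obj *o)
{
	assert(o->refs > 0);	/* one release too many is still detected */
	if (--o->refs > 0 || o->inactive_busy)
		return;
	/* Count stays at zero while the flag blocks recursive cleanup. */
	o->inactive_busy = 1;
	obj_cleanup(o);
	o->inactive_busy = 0;
}
```

Because the count really is zero during cleanup, a buggy cleanup routine that drops one reference too many trips the assertion instead of recursing.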

Ian


Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-12 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Don Lewis writes:
>After looking at ufs_inactive(), I'd like to add a fourth proposal

And I've just remembered a fifth :-) I think the old BSD code had
both an `open' count and a reference count. The open count is a
count of the real users of the vnode (it is what ufs_inactive really
wants to compare against 0), and the reference count is just for
places that you don't want the vnode to be recycled or destroyed.
This was probably lost when the encumbered BSD sources were rewritten.

At the time I was looking at it last, I remember thinking that the
open count would allow vrele/vput to keep the reference count at 1
during the VOP_INACTIVE() call, which is what you were proposing.
It would also allow us to fix the problem of many places not matching
each VOP_OPEN() with a VOP_CLOSE(). I suspect we could clean up a
lot of related problems if the open count was brought back.

Ian


Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-11 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Don Lewis writes:
>
>A potentially better solution just occurred to me.  It looks like it
>would be better if vrele() waited to decrement v_usecount until *after*
>the call to VOP_INACTIVE() (and after the call to VI_LOCK()).  If that
>were done, nfs_inactive() wouldn't need to muck with v_usecount at all.

I've looked at this before; I think some filesystems (ufs anyway)
depend on v_usecount being 0 when VOP_INACTIVE() is called. The
patch I have had lying around for quite a while is below. It adds
a vnode flag to avoid recursion into the last-reference handling
code in vrele/vput, which is the real problem.

It also guarantees that a vnode will not be recycled during
VOP_INACTIVE(), so the nfs code no longer needs to hold an extra
reference in the first place. The flag manipulation code got a bit
messy after Jeff's vnode flag splitting work, so the patch could
probably be improved.

Whatever way this is done, we should try to avoid adding more hacks
to the nfs_inactive() code anyway.

Ian

Index: sys/vnode.h
===
RCS file: /home/iedowse/CVS/src/sys/sys/vnode.h,v
retrieving revision 1.206
diff -u -r1.206 vnode.h
--- sys/vnode.h 1 Sep 2002 20:37:21 -   1.206
+++ sys/vnode.h 11 Sep 2002 11:06:46 -
@@ -220,6 +220,7 @@
 #define VI_DOOMED   0x0080  /* This vnode is being recycled */
 #define VI_FREE 0x0100  /* This vnode is on the freelist */
 #define VI_OBJDIRTY 0x0400  /* object might be dirty */
+#define VI_INACTIVE 0x0800  /* VOP_INACTIVE is in progress */
 /*
  * XXX VI_ONWORKLST could be replaced with a check for NULL list elements
  * in v_synclist.
@@ -377,14 +378,14 @@
 
 /* Requires interlock */
 #define VSHOULDFREE(vp) \
-   (!((vp)->v_iflag & (VI_FREE|VI_DOOMED)) && \
+   (!((vp)->v_iflag & (VI_FREE|VI_DOOMED|VI_INACTIVE)) && \
 !(vp)->v_holdcnt && !(vp)->v_usecount && \
 (!(vp)->v_object || \
  !((vp)->v_object->ref_count || (vp)->v_object->resident_page_count)))
 
 /* Requires interlock */
 #define VMIGHTFREE(vp) \
-   (!((vp)->v_iflag & (VI_FREE|VI_DOOMED|VI_XLOCK)) && \
+   (!((vp)->v_iflag & (VI_FREE|VI_DOOMED|VI_XLOCK|VI_INACTIVE)) && \
 LIST_EMPTY(&(vp)->v_cache_src) && !(vp)->v_usecount)
 
 /* Requires interlock */
Index: nfsclient/nfs_node.c
===
RCS file: /home/iedowse/CVS/src/sys/nfsclient/nfs_node.c,v
retrieving revision 1.55
diff -u -r1.55 nfs_node.c
--- nfsclient/nfs_node.c11 Jul 2002 17:54:58 -  1.55
+++ nfsclient/nfs_node.c11 Sep 2002 11:06:46 -
@@ -289,21 +289,7 @@
} else
sp = NULL;
if (sp) {
-   /*
-* We need a reference to keep the vnode from being
-* recycled by getnewvnode while we do the I/O
-* associated with discarding the buffers unless we
-* are being forcibly unmounted in which case we already
-* have our own reference.
-*/
-   if (ap->a_vp->v_usecount > 0)
-   (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
-   else if (vget(ap->a_vp, 0, td))
-   panic("nfs_inactive: lost vnode");
-   else {
-   (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
-   vrele(ap->a_vp);
-   }
+   (void)nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
/*
 * Remove the silly file that was rename'd earlier
 */
Index: kern/vfs_subr.c
===
RCS file: /home/iedowse/CVS/src/sys/kern/vfs_subr.c,v
retrieving revision 1.401
diff -u -r1.401 vfs_subr.c
--- kern/vfs_subr.c 5 Sep 2002 20:46:19 -   1.401
+++ kern/vfs_subr.c 11 Sep 2002 11:06:46 -
@@ -829,7 +829,8 @@
for (count = 0; count < freevnodes; count++) {
vp = TAILQ_FIRST(&vnode_free_list);
 
-   KASSERT(vp->v_usecount == 0, 
+   KASSERT(vp->v_usecount == 0 &&
+   (vp->v_iflag & VI_INACTIVE) == 0,
("getnewvnode: free vnode isn't"));
 
TAILQ_REMOVE(&vnode_free_list, vp, v_freelist);
@@ -1980,8 +1981,8 @@
KASSERT(vp->v_writecount < vp->v_usecount || vp->v_usecount < 1,
("vrele: missed vn_close"));
 
-   if (vp->v_usecount > 1) {
-
+   if (vp->v_usecount > 1 ||
+   ((vp->v_iflag & VI_INACTIVE) && vp->v_usecount == 1)) {
vp->v_usecount--;
VI_UNLOCK(vp);
 
@@ -1991,13 +1992,20 @@
if (vp->v_usecount == 1) {
vp->v_usecount--;
/*
-* We must call VOP_INACTIVE with the node l

Re: fsck cannot find superblock

2002-09-04 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Bruce Evans writes:
>> * drop support for 4K block sizes completely, but this breaks
>>   backwards compatibility
>
>I use patches like the following for the sanity checks:

I think there may be other problems that are triggered by using <8k
blocks on -current too. Last time I tried 4k blocks (pre-UFS2), the
snapshot code would cause a panic when trying to allocate a single
4k block to fit the 8k superblock (the machine then got stuck in a
reboot-fsck-panic cycle until interrupted and manually fsck'd).

Ian


Re: Is anyone else having trouble with dump(8) on -current?

2002-08-11 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Bruce Evans writes:
>
>I don't know how open() of a disk device can be interrupted by a signal
>in practice.  Most disk operations don't check for signals.

Does the PCATCH tsleep in diskopen() that I mentioned seem a likely
candidate? Anyway, below is a simple program that reproduces the
EINTR error fairly reliably for me when run on disk devices.

Ian

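#include <sys/types.h>
#include <err.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Empty handler: SIGUSR1 only needs to interrupt a sleeping open(2). */
void
handler(int sig)
{
	(void)sig;
}

int
main(int argc, char **argv)
{
	int fd, i;

	if (argc < 2)
		errx(1, "Usage: %s device", argv[0]);
	/* Four forks leave 16 processes in the same process group. */
	fork();
	fork();
	fork();
	fork();

	signal(SIGUSR1, handler);
	sleep(1);	/* give every process time to install its handler */

	for (i = 0; i < 200; i++) {
		/* Signal the whole group, then race the others in open(). */
		killpg(0, SIGUSR1);
		if ((fd = open(argv[1], O_RDONLY)) < 0)
			err(1, "%s", argv[1]);
		close(fd);
	}
	return 0;
}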


Re: Is anyone else having trouble with dump(8) on -current?

2002-08-10 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Alexander Leidinger writes:
>Have a look at Message-ID: <[EMAIL PROTECTED]>
>(should be in the archive of audit).

Ah, I had forgotten about that -audit thread.

>Short: open shouldn't be able to return EINTR in practice...
>
>My assumptions:
> - Bruce hasn't made a mistake
> - something broke in the kernel (either for a "short" period of
>   time, or it's still broken), so we should look for the real
>   problem instead

I had a quick look yesterday, and I found a PCATCH tsleep call in
diskopen(), though I do not know if this is the one that affects
dump. Does open(2) need to loop on ERESTART? Currently it just
maps ERESTART to EINTR and returns the error.

We should fix this broken dump behaviour anyway - I don't think it
matters too much for now whether it is fixed in userland or the
kernel, as it will only affect the tiny set of applications that
receive signals while opening a disk device at the same time as
another open on the same device is occurring (I think).

Ian

Re: Is anyone else having trouble with dump(8) on -current?

2002-08-09 Thread Ian Dowse

[replying to an old message]

In message <[EMAIL PROTECTED]>, Alexander Leidinger writes:
>On 7 May, Benjamin Lewis wrote:
>>  |   DUMP: slave couldn't reopen disk: Interrupted system call
>
>Try the attached patch. I also have a similar patch for restore. I don't
>like the patch, I think I should use SA_RESTART with sigaction(), so
>think about this patch as a proof of concept (if it solves your
>problem).

I was just looking at PR bin/18319 when I remembered this message.
Many of the changes in your patch are not necessary I believe, as
read(2) will restart after a signal by default. How about just
fixing the open call that actually triggers the reported error? I
suspect that many of the other cases are either impossible or
extremely unlikely in practice. Could someone who can reproduce the
"couldn't reopen disk" error try the following?

Ian

Index: tape.c
===
RCS file: /dump/FreeBSD-CVS/src/sbin/dump/tape.c,v
retrieving revision 1.22
diff -u -r1.22 tape.c
--- tape.c  8 Jul 2002 00:29:23 -   1.22
+++ tape.c  9 Aug 2002 22:28:45 -
@@ -740,8 +740,11 @@
 * Need our own seek pointer.
 */
(void) close(diskfd);
-   if ((diskfd = open(disk, O_RDONLY)) < 0)
+   while ((diskfd = open(disk, O_RDONLY)) < 0) {
+   if (errno == EINTR)
+   continue;
quit("slave couldn't reopen disk: %s\n", strerror(errno));
+   }

/*
 * Need the pid of the next slave in the loop...

Re: do we still need ufs/ffs/ffs_softdep_stub.c ?

2002-08-03 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Luigi Rizzo writes:
>Hi,
>just got the following panic with today's -current sources and
>an oldish config file (one not having "options SOFTUPDATES"):

>panic(c026ecc1,c66e1b94,c01ff565,c1cda000,0) at panic+0x7c
>softdep_slowdown(c1cda000,0,0,,2) at softdep_slowdown+0xd
>ffs_truncate(c1cda000,0,0,c00,0) at ffs_truncate+0x81

>so the question is, do we still need ffs_softdep_stub.c ? In any
>case, getting an explicit panic does not really sound right...

The bug is in ffs_truncate() - it should not be calling softdep
functions on non-softdep filesystems. The panic is there to catch
exactly this kind of bug.

I think the following patch should fix it.

Ian

Index: ffs_inode.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/ufs/ffs/ffs_inode.c,v
retrieving revision 1.81
diff -u -r1.81 ffs_inode.c
--- ffs_inode.c 19 Jul 2002 07:29:38 -  1.81
+++ ffs_inode.c 3 Aug 2002 11:05:43 -
@@ -173,7 +173,7 @@
 * soft updates below.
 */
needextclean = 0;
-   softdepslowdown = softdep_slowdown(ovp);
+   softdepslowdown = DOINGSOFTDEP(ovp) && softdep_slowdown(ovp);
extblocks = 0;
datablocks = DIP(oip, i_blocks);
if (fs->fs_magic == FS_UFS2_MAGIC && oip->i_din2->di_extsize > 0) {


Re: NEWCARD support for Linksys Ethernet broken?

2002-07-18 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, "M. Warner Losh" writes:
>Used to work for me, but something seems to have busted it in recent
>versions of the kernel.  So NEWCARD appears to be broken for ata, sio
>and ed.  Wonderful.  These all used to work at one point in the past.

I think the "ed" problem is that without pccardd, the 0x8 flag
is no longer being passed to the driver, so it doesn't try probing
it as a Linksys card (I haven't checked for sure, but that would
be consistent with it being detected as an NE2000).

Ian

Re: bdwrite: buffer is not busy

2002-07-12 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, "Andrey A. Chernov" writes:
>I see this panic constantly during last month or two, UP machine, no 
>softupdates. Anybody else saw it too? Any ideas?

The "buffer is not busy" panic is usually a secondary panic that
occurs while trying to sync the disks after a different panic.  If
possible, try to get the first panic message, or ideally a stack
trace.

I think (but I've never checked for sure) that the "buffer is not
busy" panics occur because of the following code in lockmgr(),
combined with later sanity checks:

	if (panicstr != NULL) {
		mtx_unlock(lkp->lk_interlock);
		return (0);
	}

This basically causes all lockmgr locks to be unconditionally and
immediately granted after a panic without actually marking the lock
as locked. Not surprisingly, this causes any lock state sanity
checks later to fail. The original intention was probably to avoid
deadlocking while syncing the disks, but a virtually guaranteed
secondary panic isn't helpful either. It might be worth checking
if a "return (EBUSY);" or a "lkp->lk_flags |= LK_HAVE_EXCL;
lkp->lk_lockholder = pid;" in here would do better. The alternative
is to make "kern.sync_on_panic=0" the default.
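
To make the failure mode concrete, here is a toy model of it. Nothing
below is kernel code: struct toy_lock, toy_lockmgr() and toy_is_busy()
are invented stand-ins, and the sketch assumes the later sanity checks
amount to asking "is this lock actually held?".

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the kernel's struct lock or lockmgr(). */
struct toy_lock {
	int	held;		/* nonzero once the lock is really taken */
};

static const char *toy_panicstr;	/* non-NULL after a "panic" */

/* Mimics the shortcut: report success after a panic without locking. */
static int
toy_lockmgr(struct toy_lock *lkp)
{
	assert(lkp != NULL);
	if (toy_panicstr != NULL)
		return (0);		/* "granted", but held stays 0 */
	lkp->held = 1;
	return (0);
}

/* The kind of sanity check that trips later: is the buffer busy? */
static int
toy_is_busy(const struct toy_lock *lkp)
{
	return (lkp->held != 0);
}
```

Once toy_panicstr is set, toy_lockmgr() still returns 0 but
toy_is_busy() reports the lock as not held, which is the mismatch that
would show up as a secondary panic.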

Ian

Re: -CURRENT trashes disk label

2002-07-11 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, David Schultz writes:
>I just made world on -CURRENT (cvsup a few hours ago), booted
>using a new GENERIC kernel and ran mergemaster.  Before I
>installed world, I mounted the root partition for my more stable
>development environment (4.6-RELEASE) to copy my firewall rules
>over.  In summary:

># fsck /dev/ad1s1a
>** /dev/ad1s1a
>BAD SUPER BLOCK: VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE
>/dev/ad1s1a: INCOMPLETE LABEL: type 4.2BSD fsize 0, frag 0, cpg 0, size 524288

You just need to "fsck -b16 /dev/ad1s1a", or alternatively upgrade
to the latest -STABLE fsck, which fixes this issue. There are a few
new superblock fields in use in -CURRENT that trigger some unnecessary
fsck sanity checks.

The other thing that causes scary-looking errors when moving disks
back and forth between -CURRENT and -STABLE is when the snapshot
used by -CURRENT's fsck gets left behind if you reboot during the
background fsck and then boot -STABLE.

Ian

Re: dump(8) is hosed

2002-07-07 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Don Lewis writes:
>
>I was finally able to reproduce this by creating a large file
>before doing the dump.  Dump(8) is *very* hosed.  The UFS2 import broke
>its ability to follow multiple levels of indirect blocks.

Thanks for tracking this down! One thing is that the code was using
the static pointers to avoid having to malloc and free blocks every
time. Keeping an array of NIADDR pointers and using `ind_level' as
the index is an alternative (patch below) - I doubt the performance
difference is noticeable but it avoids having to remember the free()
before each return.
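
As a standalone illustration of the per-level buffer idea, here is a
sketch that uses a toy in-memory "disk" instead of real filesystem
blocks. All names (toy_disk, toy_bread, count_data_blocks) and the
tiny sizes are invented for the example, and it assumes the underlying
problem is a recursive call overwriting the buffer its caller is still
iterating over.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NLEVELS	3	/* stands in for NIADDR */
#define NPTR	4	/* pointers per toy block; stands in for NINDIR() */
#define NBLOCKS	8

/* Toy "disk": block 0 is unused so that 0 can mean "no block". */
static int toy_disk[NBLOCKS][NPTR];

static void
toy_bread(int blkno, int *buf)
{
	assert(blkno > 0 && blkno < NBLOCKS);
	memcpy(buf, toy_disk[blkno], sizeof(toy_disk[blkno]));
}

/* Count data blocks reachable from an indirect block at ind_level. */
static int
count_data_blocks(int blkno, int ind_level)
{
	static int *idblks[NLEVELS];	/* one lazily allocated buffer per level */
	int *idblk;
	int i, count = 0;

	assert(ind_level >= 0 && ind_level < NLEVELS);
	if (idblks[ind_level] == NULL &&
	    (idblks[ind_level] = malloc(sizeof(toy_disk[0]))) == NULL)
		abort();
	idblk = idblks[ind_level];
	toy_bread(blkno, idblk);
	for (i = 0; i < NPTR; i++) {
		if (idblk[i] == 0)
			continue;
		if (ind_level == 0)
			count++;	/* entry points at a data block */
		else			/* recursion only touches lower levels */
			count += count_data_blocks(idblk[i], ind_level - 1);
	}
	return (count);
}
```

With a single shared static buffer, the recursive call would replace
the parent's pointers before the parent's loop had finished with them.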

I'll commit your printf format changes first anyway - thanks!

Ian

Index: traverse.c
===
RCS file: /dump/FreeBSD-CVS/src/sbin/dump/traverse.c,v
retrieving revision 1.19
diff -u -r1.19 traverse.c
--- traverse.c  21 Jun 2002 06:17:57 -  1.19
+++ traverse.c  7 Jul 2002 10:44:55 -
@@ -275,10 +275,13 @@
 {
int ret = 0;
int i;
-   static caddr_t idblk;
+   static caddr_t idblks[NIADDR];
+   caddr_t idblk;
 
-   if (idblk == NULL && (idblk = malloc(sblock->fs_bsize)) == NULL)
+   if (idblks[ind_level] == NULL &&
+   (idblks[ind_level] = malloc(sblock->fs_bsize)) == NULL)
quit("dirindir: cannot allocate indirect memory.\n");
+   idblk = idblks[ind_level];
bread(fsbtodb(sblock, blkno), idblk, (int)sblock->fs_bsize);
if (ind_level <= 0) {
for (i = 0; *filesize > 0 && i < NINDIR(sblock); i++) {
@@ -501,10 +505,13 @@
 dmpindir(ino_t ino, ufs2_daddr_t blk, int ind_level, off_t *size)
 {
int i, cnt;
-   static caddr_t idblk;
+   static caddr_t idblks[NIADDR];
+   caddr_t idblk;
 
-   if (idblk == NULL && (idblk = malloc(sblock->fs_bsize)) == NULL)
+   if (idblks[ind_level] == NULL &&
+   (idblks[ind_level] = malloc(sblock->fs_bsize)) == NULL)
quit("dmpindir: cannot allocate indirect memory.\n");
+   idblk = idblks[ind_level];
if (blk != 0)
bread(fsbtodb(sblock, blk), idblk, (int) sblock->fs_bsize);
else



Re: additional queue macro

2002-07-02 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Jonathan Lemon writes:
>Essentially, this provides a traversal of the tailq that is safe 
>from element removal, while being simple to drop in to those sections
>of the code that need updating, as evidenced in the patch below.

Note that this of course is not "safe from element removal" in
general; it is just safe when you remove any element other than the
next element, whereas TAILQ_FOREACH is safe when you remove any
element other than the current one. For example it would not be
safe to call a callback that could potentially remove arbitrary
elements.
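
Expanded by hand, the traversal looks something like the sketch below.
struct item and remove_even() are invented for the example; only the
TAILQ macros from <sys/queue.h> are real. The comment marks the
assumption the loop relies on.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

struct item {
	int			value;
	TAILQ_ENTRY(item)	link;
};
TAILQ_HEAD(itemhead, item);

/* Remove (and free) every even-valued item; returns how many went. */
static int
remove_even(struct itemhead *head)
{
	struct item *ip, *next;
	int removed = 0;

	assert(head != NULL);
	for (ip = TAILQ_FIRST(head); ip != NULL; ip = next) {
		/* Saved before ip can go away; next must survive the body. */
		next = TAILQ_NEXT(ip, link);
		if (ip->value % 2 == 0) {
			TAILQ_REMOVE(head, ip, link);
			free(ip);
			removed++;
		}
	}
	return (removed);
}
```

Removing the current element is fine here because its successor was
saved first; if the loop body could remove that saved successor, the
next iteration would start from a stale pointer.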

It may be clearer in this case just to expand the macro in the code
so that it is more obvious what assumptions can be made.

Ian

Re: KSE status report

2002-07-02 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Julian Elischer writes:
>The big problem at the moment is that something in the 
>source tree as a whole, and probably something that came in with KSE
>is stopping us from successfully compiling a working libc_r.
>(a bit ironic really).

Is the new

(elm)->field.tqe_next = (void *)-1;

in TAILQ_REMOVE a likely candidate? That could easily tickle old
bugs in other code. The libc_r code does use a lot of TAILQ macros.

Ian

Re: cvs commit: src/sys/i386/i386 pmap.c

2002-06-27 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, "Andrew R. Reiter" writes:
>"Too many pages were prefaulted in pmap_object_init_pt, thus
> the wrong physical page was entered in the pmap for the virtual
> address where the .dynamic section was supposed to be."
>  
>  Submitted by:   tegge

Pointy hat to: iedowse

Sorry for the breakage, and thanks Tor for tracking it down. Somehow
my testing (mainly in a netbooted environment) didn't show this up,
and I failed to spot the bug even when I re-read the patches after
the breakage reports appeared yesterday.

Ian

Re: -current panic in suser_cred()

2002-06-24 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Wesley Morgan writes:
>At some point between 20 Jun and (by my best guess) 22 Jun there has been
>a problem introduced somewhere... How much more vague can you get? :)...

>#12 0xc025dab5 in chkiq (ip=0xc3a5c400, change=4294967295, cred=0x0,
>flags=0)
>#13 0xc025b57f in ufs_inactive (ap=0xdb467be0)
>at ../../../ufs/ufs/ufs_inode.c:132

The UFS2 changes or something else probably broke quotas - try
removing the "options QUOTA" from your kernel config for now.

Ian

Typo in uma_core.c causing panics after uma_zdestroy()

2002-06-04 Thread Ian Dowse



The logic for testing UMA_ZFLAG_INTERNAL in zone_dtor() is reversed.
I was able to reliably reproduce crashes with:

mdconfig -a -t malloc -s 10m
mdconfig -d -u 0
mdconfig -a -t malloc -s 10m
mdconfig -d -u 0

Ian

Index: uma_core.c
===
RCS file: /FreeBSD/FreeBSD-CVS/src/sys/vm/uma_core.c,v
retrieving revision 1.26
diff -u -r1.26 uma_core.c
--- uma_core.c  3 Jun 2002 22:59:19 -   1.26
+++ uma_core.c  5 Jun 2002 01:17:27 -
@@ -1132,7 +1132,7 @@
printf("Zone %s was not empty.  Lost %d pages of memory.\n",
zone->uz_name, zone->uz_pages);
 
-   if ((zone->uz_flags & UMA_ZFLAG_INTERNAL) != 0)
+   if ((zone->uz_flags & UMA_ZFLAG_INTERNAL) == 0)
for (cpu = 0; cpu < maxcpu; cpu++)
CPU_LOCK_FINI(zone, cpu);
 

Re: df

2002-05-18 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Terry Lambert writes:
>I think the reason for the "if" is to keep the df from hanging
>indefinitely, particularly when you give it an explicit list.

No, I believe the "if (vfslist != NULL)" code was there to reduce
the maximum column widths to those necessary for an explicit list
of filesystems rather than using the maximum widths for all
filesystems. The regetmntinfo() call before the "if" has already
performed any operations that could hang indefinitely, so it makes
sense to unconditionally use these up-to-date results instead of
the potentially stale list from getmntinfo(..., MNT_NOWAIT).

Ian

Re: df

2002-05-18 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Ian writes:
>Actually, I now think a more-correct fix would be to have no if statement at
>all, and just always recalculate the field widths after calling the routine
>to re-get the stats.

Yes, I agree. Committed, thanks!

Ian

Re: cross-buildworld from i386 to alpha borked...

2002-05-04 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Poul-Henning Kamp writes:
>
>I tried "make buildworld TARGET_ARCH=alpha" and it croaked.  Is this
>expected breakage for a cross-build or genuine breakage ?

>/flat/src/usr.sbin/pstat/pstat.c:546: `NLOCKED' undeclared (first use in this function)

It's genuine breakage, caused by me, and fixed a few days ago. Try
updating again I guess. Sorry...

Ian

Re: Revision 1.88 of kern_linker.c breaks module loading for diskless

2002-04-25 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Harti Brandt writes:
>the check for rootdev != NODEV introduced in rev 1.88 breaks loading of
>kernel modules from an NFS mounted root in diskless configurations.
>Dropping in gdb and printing rootdev shows -1 which is, I assume, NODEV.

Ah, that would explain a problem I saw recently on a netbooted box
where kldload only worked with full module paths. Could `rootvnode'
be checked for NULL instead?

Ian

Re: boot floppy problems...

2002-01-17 Thread Ian Dowse


In message , Mike Brancato writes:
>no problem.
>keep up the good work.
>
>mike

Ok, it's fixed now. If you'd like to try it, there's an updated
version of the kern.flp from today's -CURRENT snapshot at:

http://www.maths.tcd.ie/~iedowse/FreeBSD/kern.flp

Ian

Re: boot floppy problems...

2002-01-17 Thread Ian Dowse


In message , Mike Brancato writes:
>oh, well.  They say something along the lines of
>"Disk error: lba is 0x9 (should be 0x10)"
>or similar.  then it tries to boot the kernel twice using the loader, but
>fails with the path 0:fd(0,a)/kernel

Hmm, the error is actually "Disk error 0x9 (lba=0x10)". I think
this is my fault. Error 9 is "data boundary error (attempted DMA
across 64K boundary or >80h sectors)", so by changing the buffers
to being static in revision 1.35 of boot2.c, I broke the guarantee
that single transfers don't cross a 64k boundary, which is important
for floppies :-( I'll fix this shortly. Thanks for pointing out the
problem!
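
For reference, the condition that has to hold can be written as a
small check. crosses_64k() is a hypothetical helper for illustration,
not code from boot2.c, and it assumes addr is the physical address of
the transfer buffer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: does a transfer of len bytes starting at
 * physical address addr span a 64 KiB boundary?
 */
static int
crosses_64k(uint32_t addr, uint32_t len)
{
	assert(len > 0);
	/* First and last byte must fall in the same 64 KiB window. */
	return ((addr >> 16) != ((addr + len - 1) >> 16));
}
```

A 512-byte sector buffer at 0xfe00 ends exactly at the boundary and is
fine, while the same buffer at 0xff00 straddles it.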

Ian

Re: Netatalk broken in current? Lock order reversal?

2002-01-15 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Emiel Kollof writes:
>
>Oh, on another note, is someone working at that netatalk breakage? Who
>do I have to discipline for that? :-)

Could you try the following patch in src/sys/netatalk? The problem
was caused by the -fno-common compiler option that was added to
the kernel build flags recently.

This compiles for me, but I haven't checked that it actually works.

Ian

Index: ddp_input.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/netatalk/ddp_input.c,v
retrieving revision 1.12
diff -u -r1.12 ddp_input.c
--- ddp_input.c 13 Feb 2000 03:31:58 -  1.12
+++ ddp_input.c 16 Jan 2002 01:30:50 -
@@ -27,8 +27,6 @@
 static struct ddpstat  ddpstat;
 static struct routeforwro;
 
-const int atintrq1_present = 1, atintrq2_present = 1;
-
 static void ddp_input(struct mbuf *, struct ifnet *, struct elaphdr *, int);
 
 /*
Index: ddp_usrreq.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/netatalk/ddp_usrreq.c,v
retrieving revision 1.22
diff -u -r1.22 ddp_usrreq.c
--- ddp_usrreq.c17 Nov 2001 03:07:08 -  1.22
+++ ddp_usrreq.c16 Jan 2002 01:32:34 -
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -547,6 +548,8 @@
 {
 atintrq1.ifq_maxlen = IFQ_MAXLEN;
 atintrq2.ifq_maxlen = IFQ_MAXLEN;
+atintrq1_present = 1;
+atintrq2_present = 1;
 mtx_init(&atintrq1.ifq_mtx, "at1_inq", MTX_DEF);
 mtx_init(&atintrq2.ifq_mtx, "at2_inq", MTX_DEF);
 }

Re: mountd(8) leaving filesystems exported

2001-12-18 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Terry Lambert writes:
>> One nasty bug is that the code for un-exporting filesystems checks
>> to see if the filesystem is among a list of supported types, but
>> the code for exporting doesn't. This list of supported filesystems
>> does not include ext2fs or hpfs, so you can successfully export
>> these filesystems, but they remain exported even when the /etc/exports
>> entry is removed and mountd is restarted or sent a SIGHUP, and no
>> errors are logged...
>
>This is actually the wrong way to go about this.

I'll agree with this much anyway :-) Ignoring for now how the exports
are managed in the kernel, it is really bad that mountd needs to
know about individual filesystems in order to NFS export them. The
export interface also does not allow the export list to be replaced
atomically, so all of the exports fail briefly when mountd reloads them
on receipt of a SIGHUP.

There is apparently work ongoing to improve the mount(2) interface
(I forget who is doing this). Hopefully this should make it much
easier to arrange for mountd to change the export lists in a
filesystem-independent manner, even if exports are still managed
per-filesystem in the kernel.

However for this bug (ext2fs and hpfs filesystems cannot be un-exported
once they have been exported) I am just looking for a quick solution
for now, but I have already put some thought into improving the
mountd-kernel interface, which is something I really want to see
fixed.

Ian

mountd(8) leaving filesystems exported

2001-12-14 Thread Ian Dowse



There are quite a few assumptions in mountd(8) about the layout of
the per-filesystem mount(2) `data' struct, which make the code quite
ugly. It uses a union to ensure that it supplies a large enough
structure to mount(2), but regardless of the filesystem type, it
always initialises the UFS version.

One nasty bug is that the code for un-exporting filesystems checks
to see if the filesystem is among a list of supported types, but
the code for exporting doesn't. This list of supported filesystems
does not include ext2fs or hpfs, so you can successfully export
these filesystems, but they remain exported even when the /etc/exports
entry is removed and mountd is restarted or sent a SIGHUP, and no
errors are logged...

The patch below should address this issue by checking the same list
of filesystems in both cases, and adding ext2fs and hpfs to the
filesystem list. It also avoids the need to assume that all xxx_args
have the export_args in the same place by storing the offsets in a
table. I am aware that there is work ongoing in the area of mount(2),
so maybe the patch is overkill at this time. Any comments?
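
The offset-table lookup can be sketched in isolation as follows. The
toy_* structs are simplified stand-ins for export_args and the real
xxx_args layouts, and this version takes the filesystem name directly
rather than a struct statfs, so treat it as an illustration of the
technique rather than the code in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for export_args and two xxx_args layouts. */
struct toy_export_args {
	int	ex_flags;
};
struct toy_ufs_args {
	char			*fspec;
	struct toy_export_args	export;
};
struct toy_iso_args {
	char			*fspec;
	int			flags;
	struct toy_export_args	export;
};
union toy_mountdata {
	struct toy_ufs_args	ua;
	struct toy_iso_args	ia;
};

static const struct toy_ea_off {
	const char	*fsname;
	size_t		exportargs_off;
} toy_ea_offtable[] = {
	{ "ufs", offsetof(union toy_mountdata, ua.export) },
	{ "cd9660", offsetof(union toy_mountdata, ia.export) },
	{ NULL, 0 }
};

/* Return the export args inside *mdp for fsname, or NULL if unsupported. */
static struct toy_export_args *
toy_mountdata_to_eap(union toy_mountdata *mdp, const char *fsname)
{
	const struct toy_ea_off *eo;

	assert(mdp != NULL && fsname != NULL);
	for (eo = toy_ea_offtable; eo->fsname != NULL; eo++) {
		if (strcmp(eo->fsname, fsname) == 0)
			return ((struct toy_export_args *)
			    ((char *)mdp + eo->exportargs_off));
	}
	return (NULL);
}
```

Because the same table answers both "is this filesystem supported?"
and "where are its export args?", the export and un-export paths
cannot drift apart.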

Ian

Index: mountd.c
===
RCS file: /dump/FreeBSD-CVS/src/sbin/mountd/mountd.c,v
retrieving revision 1.59
diff -u -r1.59 mountd.c
--- mountd.c20 Sep 2001 02:15:17 -  1.59
+++ mountd.c15 Dec 2001 00:10:47 -
@@ -76,6 +76,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -157,6 +158,29 @@
nfsfh_t fhr_fh;
 };
 
+/* Union of mount(2) `data' structs for supported filesystems. */
+union mountdata {
+   struct ufs_args ua;
+   struct iso_args ia;
+   struct msdosfs_args da;
+   struct ntfs_args na;
+};
+
+/* Find the offset into the mountdata union of a filesystem's export_args. */
+struct ea_off {
+   char *fsname;
+   int exportargs_off;
+} ea_offtable[] = {
+   {"ufs", offsetof(union mountdata, ua.export)},
+   {"ifs", offsetof(union mountdata, ua.export)},
+   {"ext2fs", offsetof(union mountdata, ua.export)},
+   {"cd9660", offsetof(union mountdata, ia.export)},
+   {"msdosfs", offsetof(union mountdata, da.export)},
+   {"ntfs", offsetof(union mountdata, na.export)},
+   {"hpfs", offsetof(union mountdata, ua.export)}, /* XXX */
+   {NULL, 0}
+};
+
 /* Global defs */
 char   *add_expdir __P((struct dirlist **, char *, int));
 void   add_dlist __P((struct dirlist **, struct dirlist *,
@@ -191,6 +215,7 @@
 void   huphandler(int sig);
 intmakemask(struct sockaddr_storage *ssp, int bitlen);
 void   mntsrv __P((struct svc_req *, SVCXPRT *));
+struct export_args *mountdata_to_eap __P((union mountdata *, struct statfs *));
 void   nextfield __P((char **, char **));
 void   out_of_mem __P((void));
 void   parsecred __P((char *, struct xucred *));
@@ -884,6 +909,8 @@
 void
 get_exportlist()
 {
+   union mountdata args;
+   struct export_args *eap;
struct exportlist *ep, *ep2;
struct grouplist *grp, *tgrp;
struct exportlist **epp;
@@ -918,26 +945,16 @@
/*
 * And delete exports that are in the kernel for all local
 * file systems.
-* XXX: Should know how to handle all local exportable file systems
-*  instead of just "ufs".
 */
num = getmntinfo(&fsp, MNT_NOWAIT);
for (i = 0; i < num; i++) {
-   union {
-   struct ufs_args ua;
-   struct iso_args ia;
-   struct msdosfs_args da;
-   struct ntfs_args na;
-   } targs;
-
-   if (!strcmp(fsp->f_fstypename, "ufs") ||
-   !strcmp(fsp->f_fstypename, "msdosfs") ||
-   !strcmp(fsp->f_fstypename, "ntfs") ||
-   !strcmp(fsp->f_fstypename, "cd9660")) {
-   targs.ua.fspec = NULL;
-   targs.ua.export.ex_flags = MNT_DELEXPORT;
+   eap = mountdata_to_eap(&args, fsp);
+   if (eap != NULL) {
+   /* This is a filesystem that supports NFS exports. */
+   bzero(&args, sizeof(args));
+   eap->ex_flags = MNT_DELEXPORT;
if (mount(fsp->f_fstypename, fsp->f_mntonname,
-   fsp->f_flags | MNT_UPDATE, (caddr_t)&targs) < 0 &&
+   fsp->f_flags | MNT_UPDATE, &args) < 0 &&
errno != ENOENT)
syslog(LOG_ERR,
"can't delete exports for %s: %m",
@@ -1711,23 +1728,23 @@
int dirplen;
struct statfs *fsb;
 {
+   union mountdata args;
struct statfs fsb1;
struct addrinfo *ai;
struct export_args *eap;
char *cp = NULL;
int done;
char savedc = '\0';
-   union {
-   struct ufs_args ua;
-   struct i

Re: change to ZALLOC(9) man page

2001-12-14 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Julian Elischer writes:
>By my reading of the code I would like to make the following changes
>to the documentation for the zone(9) man page;

Yes! Please do. I must have read that page about 10 times and been
annoyed at its misleading information, but I never got around to
fixing it. There's one spelling typo, s/mentionned/mentioned/, and
maybe "type stable" should be hyphenated.

Ian

Re: cvs commit: src/sys/kern subr_diskmbr.c

2001-12-10 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Peter Wemm writes:
>The problem is, that you **are** using fdisk tables, you have no choice.
>DD mode included a *broken* fdisk table that specified an illegal geometry.
...
>This is why it is called dangerous.

BTW, I presume you are aware of the way sysinstall creates DD MBRs;
it does not use the 5 sector slice 4 method, but sets up slice
1 to cover the entire disk including the MBR, with c/h/s entries
corresponding to the real start and end of the disk, e.g:

cylinders=3544 heads=191 sectors/track=53 (10123 blks/cyl)
...
The data for partition 1 is:
sysid 165,(FreeBSD/NetBSD/386BSD)
start 0, size 35885168 (17522 Meg), flag 80 (active)
beg: cyl 0/ head 0/ sector 1;
end: cyl 1023/ head 190/ sector 53
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>

Otherwise the disk layout is the same as disklabel's DD. I suspect
that this approach is much less illegal than disklabel's MBRs
although I do remember seeing a HP PC that disliked it. I wonder
if a reasonable compromise is to make disklabel use this system for
DD disks instead of the bogus 5 sector slice? Obviously, it
should also somehow not install a partition table unless boot1 is
being used as the MBR, and the fdisk -I method should be preferred.
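
To show where the c/h/s values in that fdisk output come from, here is
a small conversion sketch. struct chs and lba_to_chs() are invented
for the example, and the clamping rule is an assumption inferred from
the "end" entry above: when the cylinder does not fit in the MBR's
10-bit field, the entry is pinned to cylinder 1023 with the last head
and sector.

```c
#include <assert.h>
#include <stdint.h>

struct chs {
	uint32_t cyl, head, sec;	/* sec is 1-based, as in the MBR */
};

/*
 * Convert an LBA to c/h/s for the given geometry.  Assumption for this
 * sketch: cylinders above 1023 are clamped to 1023/last head/last
 * sector.
 */
static struct chs
lba_to_chs(uint32_t lba, uint32_t heads, uint32_t spt)
{
	struct chs r;

	assert(heads > 0 && spt > 0);
	r.cyl = lba / (heads * spt);
	r.head = (lba / spt) % heads;
	r.sec = lba % spt + 1;
	if (r.cyl > 1023) {
		r.cyl = 1023;
		r.head = heads - 1;
		r.sec = spt;
	}
	return (r);
}
```

With heads=191 and sectors/track=53, LBA 0 maps to 0/0/1 and the last
sector of the slice (35885168 - 1) clamps to 1023/190/53, matching the
partition 1 entry.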

Ian

Re: vmware fails on -current

2001-11-21 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, CHOI Junho writes:
>
>I'll try. Oh, I forgot to say I applied des's linux_ioctl patch.
>

Ah, that's different then. I assumed from the error that you had
revision 1.76 of linux_ioctl.c, but if that patch applied then you
don't. Try updating your sources again; revision 1.76 is des's
patch with a few problems fixed.

Ian

Re: vmware fails on -current

2001-11-21 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, CHOI Junho writes:
>
>Hmm.. I have experienced another problem(-current of 19 Nov.) with
>vmware. When it runs it comes up with the following dialog:
>
>  "Encountered an error while initializing the ethernet address.
>   You probably have an old vnet driver. Try installing a newer version
>   Failed to configure ethernet0"

Hi, could you try to get a ktrace of what it is doing just before
this happens? Run

ktrace -i vmware

as root (you may need to copy your ~/.vmware to ~root first). Then
use "linux_kdump -n" (/usr/ports/devel/linux_kdump) and look for
any ioctls that it does immediately before giving that error message.

Ian

Re: vmware fails on -current

2001-11-19 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, "Georg-W. Koltermann" writes:
>I also tried to update /compat/linux/dev/vmnet1 to match the
>/dev/vmnet1, and that got me just a little bit farther.  I now get
>"Could not get address for /dev/vmnet1: Invalid argument
>Failed to configure ethernet0." I added some printf's to
>linux_ioctl.c, and it seems the linux_ioctl_socket() gets a device
>name which is "", i.e. the empty string.

There was a discussion about this and workaround patches for RELENG_4
on -emulation. I'll try to organise with DES to get something
committed soon.

Ian

Re: Multiple NFS server problems with Solaris 8 clients

2001-10-25 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, BSD User writes:
>Actually, upon instrumenting some code, it looks like RELEASE-4.4 gets it
>mostly right.  It ejects a PROG_UNAVAIL call which causes the Solaris 8
>client to back off.  The correct message would seem to be PROC_UNAVAIL,
>but I would take PROG_UNAVAIL if I could get -current to eject it.

I think PROG_UNAVAIL is correct; the packet trace that Thomas
provided shows an RPC request with a program ID of 100227 which is
not the NFS program ID.

Try the patch below. Peter's NFS revamp changed the semantics of
the nfsm_reply() macro, and nfsrv_noop() was not updated to match.
Previously nfsm_reply would set 'error' to 0 when nd->nd_flag did
not have ND_NFSV3 set, and much of the code that uses nfsrv_noop
to generate errors ensured that nd->nd_flag was zero. Now nfsm_reply
never sets 'error' to 0, so it needs to be done explicitly. Server
op functions must return 0 in order for a reply to be sent to the
client.

Ian

Index: nfs_serv.c
===
RCS file: /home/iedowse/CVS/src/sys/nfsserver/nfs_serv.c,v
retrieving revision 1.107
diff -u -r1.107 nfs_serv.c
--- nfs_serv.c  2001/09/28 04:37:08 1.107
+++ nfs_serv.c  2001/10/25 16:19:33
@@ -4000,6 +4000,7 @@
else
error = EPROCUNAVAIL;
nfsm_reply(0);
+   error = 0;
 nfsmout:
return (error);
 }

Re: Multiple NFS server problems with Solaris 8 clients

2001-10-14 Thread Ian Dowse


>
>The last one is a known problem. There is a (unfinished) patch available to
>solve this problem. Thomas Moestl <[EMAIL PROTECTED]> is still working on
>some issues of the patch. Please contact him if you like to know more.
>
>Here is the URL for the patch:
>
>http://home.teleport.ch/freebsd/userland/nfsd-loop.diff

That patch is a bit out of date, because Peter has since removed a
big chunk of kerberos code from nfsd. I was actually just looking at
this problem again, so I include an updated version of Thomas's
patch below.

This version also removes entries from the children[] array when
a slave nfsd dies to avoid the possibility of accidentally killing
unrelated processes.

The issue that remains open with the patch is that currently if a
slave nfsd dies, then all nfsds will shut down. This is because
nfssvc() in the master nfsd returns 0 when the master nfsd receives
a SIGCHLD. This behaviour is probably reasonable enough, but the
way it happens is a bit odd.

Thomas, I'll probably commit this within the next few days if you
have no objections, and if you don't get there before me.  The
exiting behaviour can be resolved later if necessary.
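
The bookkeeping for the children[] array can be sketched on its own as
below. This is a simplified illustration, not the code from the patch:
MAXCHILDREN, remember_child(), reap_children() and kill_children() are
names chosen for the example, and 0 is assumed to mark an empty slot.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAXCHILDREN	20	/* hypothetical limit for this sketch */

static pid_t children[MAXCHILDREN];	/* 0 marks an unused slot */

/* Record a freshly forked child in the given slot. */
static void
remember_child(int slot, pid_t pid)
{
	assert(slot >= 0 && slot < MAXCHILDREN && pid > 0);
	children[slot] = pid;
}

/* Reap every child that has exited and forget its pid; returns the count. */
static int
reap_children(void)
{
	pid_t pid;
	int i, reaped = 0;

	while ((pid = waitpid(-1, NULL, WNOHANG)) > 0) {
		for (i = 0; i < MAXCHILDREN; i++)
			if (children[i] == pid)
				children[i] = 0;
		reaped++;
	}
	return (reaped);
}

/* Signal only pids that are still known to be our children. */
static void
kill_children(void)
{
	int i;

	for (i = 0; i < MAXCHILDREN; i++)
		if (children[i] > 0)
			(void)kill(children[i], SIGKILL);
}
```

Clearing the slot as soon as the child is reaped means a pid that the
system later reuses for an unrelated process is never signalled.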

Ian

Index: nfsd.c
===
RCS file: /dump/FreeBSD-CVS/src/sbin/nfsd/nfsd.c,v
retrieving revision 1.21
diff -u -r1.21 nfsd.c
--- nfsd.c  20 Sep 2001 02:18:06 -  1.21
+++ nfsd.c  14 Oct 2001 20:19:18 -
@@ -52,6 +52,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -64,6 +66,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -86,12 +89,16 @@
 intnfsdcnt;/* number of children */
 
 void   cleanup(int);
+void   child_cleanup(int);
 void   killchildren(void);
-void   nonfs (int);
-void   reapchild (int);
-intsetbindhost (struct addrinfo **ia, const char *bindhost, struct addrinfo hints);
-void   unregistration (void);
-void   usage (void);
+void   nfsd_exit(int);
+void   nonfs(int);
+void   reapchild(int);
+intsetbindhost(struct addrinfo **ia, const char *bindhost,
+   struct addrinfo hints);
+void   start_server(int);
+void   unregistration(void);
+void   usage(void);
 
 /*
  * Nfs server daemon mostly just a user context for nfssvc()
@@ -126,13 +133,12 @@
fd_set ready, sockbits;
fd_set v4bits, v6bits;
int ch, connect_type_cnt, i, len, maxsock, msgsock;
-   int nfssvc_flag, on = 1, unregister, reregister, sock;
+   int on = 1, unregister, reregister, sock;
int tcp6sock, ip6flag, tcpflag, tcpsock;
-   int udpflag, ecode, s;
-   int bindhostc = 0, bindanyflag, rpcbreg, rpcbregcnt;
+   int udpflag, ecode, s, srvcnt;
+   int bindhostc, bindanyflag, rpcbreg, rpcbregcnt;
char **bindhost = NULL;
pid_t pid;
-   int error;
 
if (modfind("nfsserver") < 0) {
/* Not present in kernel, try loading it */
@@ -141,8 +147,8 @@
}
 
nfsdcnt = DEFNFSDCNT;
-   unregister = reregister = tcpflag = 0;
-   bindanyflag = udpflag;
+   unregister = reregister = tcpflag = maxsock = 0;
+   bindanyflag = udpflag = connect_type_cnt = bindhostc = 0;
 #define GETOPT  "ah:n:rdtu"
 #define USAGE   "[-ardtu] [-n num_servers] [-h bindip]"
while ((ch = getopt(argc, argv, GETOPT)) != -1)
@@ -313,8 +319,6 @@
daemon(0, 0);
(void)signal(SIGHUP, SIG_IGN);
(void)signal(SIGINT, SIG_IGN);
-   (void)signal(SIGSYS, nonfs);
-   (void)signal(SIGUSR1, cleanup);
/*
 * nfsd sits in the kernel most of the time.  It needs
 * to ignore SIGTERM/SIGQUIT in order to stay alive as long
@@ -324,40 +328,31 @@
(void)signal(SIGTERM, SIG_IGN);
(void)signal(SIGQUIT, SIG_IGN);
}
+   (void)signal(SIGSYS, nonfs);
(void)signal(SIGCHLD, reapchild);
 
-   openlog("nfsd:", LOG_PID, LOG_DAEMON);
+   openlog("nfsd", LOG_PID, LOG_DAEMON);
 
-   for (i = 0; i < nfsdcnt; i++) {
+   /* If we use UDP only, we start the last server below. */
+   srvcnt = tcpflag ? nfsdcnt : nfsdcnt - 1;
+   for (i = 0; i < srvcnt; i++) {
switch ((pid = fork())) {
case -1:
syslog(LOG_ERR, "fork: %m");
-   killchildren();
-   exit (1);
+   nfsd_exit(1);
case 0:
break;
default:
children[i] = pid;
continue;
}
-
+   (void)signal(SIGUSR1, child_cleanup);
setproctitle("server");
-   nfssvc_flag = NFSSVC_NFSD;
-   nsd.nsd_nfsd = NULL;
-   while (nfssvc(nfssvc_flag, &nsd) < 0) {
-   if (errno) {
-   syslog(LOG_ERR, "n

Missing stack frames in kgdb/ddb traces

2001-10-07 Thread Ian Dowse



I noticed recently two problems with gdb/ddb traces that involve an
interrupt frame (both of these are in i386-specific code, but maybe
similar issues exist on other architectures):

The first is that kgdb sometimes messes up a stack frame that
includes an interrupt, e.g. in the trace below, the cpu_idle() frame
is corrupted.

#7  0xc0325246 in siointr1 (com=0xc092a400) at machine/cpufunc.h:63
#8  0xc0325137 in siointr (arg=0xc092a400) at ../../../isa/sio.c:1859
#9  0x8 in ?? ()
#10 0xc01ff391 in idle_proc (dummy=0x0) at ../../../kern/kern_idle.c:99
#11 0xc01ff210 in fork_exit (callout=0xc01ff370 , arg=0x0, 
frame=0xc40ffd48) at ../../../kern/kern_fork.c:785

This is because gdb was never updated when cpl was removed from the
interrupt frame (ddb was changed in i386/i386/db_trace.c rev 1.37).
The following patch seems to fix it:

Index: gnu/usr.bin/binutils/gdb/i386/kvm-fbsd.c
===
RCS file: /dump/FreeBSD-CVS/src/gnu/usr.bin/binutils/gdb/i386/kvm-fbsd.c,v
retrieving revision 1.27
diff -u -r1.27 kvm-fbsd.c
--- gnu/usr.bin/binutils/gdb/i386/kvm-fbsd.c 19 Sep 2001 18:42:19 -  1.27
+++ gnu/usr.bin/binutils/gdb/i386/kvm-fbsd.c 7 Oct 2001 19:45:28 -
@@ -176,7 +176,7 @@
return (read_memory_integer (fr->frame + 8 + oEIP, 4));
 
case tf_interrupt:
-   return (read_memory_integer (fr->frame + 16 + oEIP, 4));
+   return (read_memory_integer (fr->frame + 12 + oEIP, 4));
 
case tf_syscall:
return (read_memory_integer (fr->frame + 8 + oEIP, 4));


Secondly, fast interrupts do not have an XresumeN style of symbol,
so neither gdb nor ddb treat their frames as interrupt frames.
This causes the frame listed as XfastintrN to gobble up the frame
that was executing at the time of the interrupt, which is especially
annoying when a serial console is being used to debug an infinite
loop in the kernel.

The following patch adds an XresumefastN to fast interrupt handlers,
which allows gdb and ddb to correctly see the missing frame. The
name Xresumefast is chosen because it involves no ddb or gdb changes
(they just check for a name beginning with "Xresume").
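
To make the name check concrete, here is a minimal sketch of a prefix
test of the kind described above. It is not taken from the gdb or ddb
sources; the function name is_interrupt_resume_symbol() is made up for
illustration, and only the "Xresume" prefix comes from the text.

```c
#include <string.h>

/*
 * Hypothetical helper: returns nonzero if a symbol name marks the
 * resume point of an interrupt handler, i.e. it begins with "Xresume".
 * Only the prefix is compared, so "Xresumefast4" matches in the same
 * way as "Xresume4" and the debuggers need no change.
 */
int
is_interrupt_resume_symbol(const char *name)
{
	static const char prefix[] = "Xresume";

	return (name != NULL &&
	    strncmp(name, prefix, sizeof(prefix) - 1) == 0);
}
```

With this test, a symbol such as Xresumefast4 is accepted while
Xfastintr4 is not, which is why the label has to be added inside the
fast interrupt handler.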

Any comments?

Ian

Index: sys/i386/isa/icu_vector.s
===
RCS file: /dump/FreeBSD-CVS/src/sys/i386/isa/icu_vector.s,v
retrieving revision 1.29
diff -u -r1.29 icu_vector.s
--- sys/i386/isa/icu_vector.s   12 Sep 2001 08:37:34 -  1.29
+++ sys/i386/isa/icu_vector.s   7 Oct 2001 19:48:06 -
@@ -60,6 +60,7 @@
mov %ax,%es ; \
mov $KPSEL,%ax ; \
mov %ax,%fs ; \
+__CONCAT(Xresumefast,irq_num): ; \
FAKE_MCOUNT((12+ACTUALLY_PUSHED)*4(%esp)) ; \
movl    PCPU(CURTHREAD),%ebx ; \
incl    TD_INTR_NESTING_LEVEL(%ebx) ; \



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message

modules/ed/../../dev/if_ed.c:40: opt_ed.h: No such file or directory

2001-09-29 Thread Ian Dowse



Apologies for this - I missed out a file in a commit earlier. Fixed
now. Any other (non-module) complaints about opt_ed.h can be cured
by rerunning config.

Ian


Re: mdmfs mount_mfs compatibility bug?

2001-09-29 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Dima Dorfman writes:
>
>The problem with this is that in a bikeshed far, far in the past, some
>people wanted to be able to call it "mount_md" instead of "mount_mfs".
>Of course, we could allow "mfs" and "md", but that seems rather ugly
>(what if someone wants "fish"?).  I'd rather see mount(8) use
>mount_xxx, although if we think that would break something, your patch
>is probably the best solution.

I can't think of any good reason not to change mount(8), but I also
think that mdmfs only needs to support the weird mount_mfs defaults
when invoked with a name of "mount_mfs" or "mfs". People can call
it mount_fish if they like and it will work fine, just with the
mdmfs rather than mount_mfs defaults. The non-compatibility defaults
are better defaults anyway, so they should probably be used in all
cases except those that are necessary for compatibility with
mount_mfs.

Ian


Re: mdmfs mount_mfs compatibility bug?

2001-09-29 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Jos Backus writes:
>> 
>> This was fixed some time ago, I thought.  Are you up to date?
>
>There was a commit to mdmfs.c in August.
>This is with yesterday's -current, sorry, should have mentioned that.

The "mount -t mfs" case doesn't work with mdmfs, because mount(8)
uses the filesystem name, not the mount_xxx program as argv[0]. I
had guessed this would be a problem when I read the commit message
for revision 1.7 of mdmfs.c, but then I forgot to mention it to
Dima.

Here is a patch that should help - it makes mdmfs accept "mount_mfs"
or "mfs" to trigger compatibility mode instead of mount_*.

Ian

Index: mdmfs.8
===
RCS file: /dump/FreeBSD-CVS/src/sbin/mdmfs/mdmfs.8,v
retrieving revision 1.8
diff -u -r1.8 mdmfs.8
--- mdmfs.8 16 Aug 2001 07:43:16 -  1.8
+++ mdmfs.8 29 Sep 2001 23:50:29 -
@@ -304,9 +304,10 @@
 flag,
 or by starting
 .Nm
-with
-.Li mount_
-at the beginning of its name
+with the name
+.Li mount_mfs
+or
+.Li mfs
 (as returned by
 .Xr getprogname 3 ) .
 In this mode, only the options which would be accepted by
Index: mdmfs.c
===
RCS file: /dump/FreeBSD-CVS/src/sbin/mdmfs/mdmfs.c,v
retrieving revision 1.7
diff -u -r1.7 mdmfs.c
--- mdmfs.c 16 Aug 2001 02:40:29 -  1.7
+++ mdmfs.c 29 Sep 2001 22:58:05 -
@@ -116,8 +116,9 @@
newfs_arg = strdup("");
mount_arg = strdup("");
 
-   /* If we were started as mount_*, imply -C. */
-   if (strncmp(getprogname(), "mount_", 6) == 0)
+   /* If we were started as mount_mfs or mfs, imply -C. */
+   if (strcmp(getprogname(), "mount_mfs") == 0 ||
+   strcmp(getprogname(), "mfs") == 0)
compat = true;
 
while ((ch = getopt(argc, argv,





Re: panic on mount

2001-09-25 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, John Baldwin writes:
>
>It looks like the mutex is really held since the mtx_assert before
>witness_unlock didn't trigger.  You can try turning witness off for the time
>being as a workaround.  I'm not sure why witness would be broken, however.

Revision 1.41 of sys/mutex.h seems to be the culprit. Before 1.41,
the defined(LOCK_DEBUG) and !defined(LOCK_DEBUG) cases were identical
except that with LOCK_DEBUG defined, the function versions of
_mtx_*lock_* were used. After 1.41, the !defined(LOCK_DEBUG) case
misses all the MPASS/KASSERT/LOCK_LOG/WITNESS bits.

A simple workaround that seems to stop the panics is below.

Ian

Index: mutex.h
===
RCS file: /dump/FreeBSD-CVS/src/sys/sys/mutex.h,v
retrieving revision 1.41
diff -u -r1.41 mutex.h
--- mutex.h 2001/09/22 21:19:55 1.41
+++ mutex.h 2001/09/26 00:46:09
@@ -238,7 +238,7 @@
 #define mtx_unlock(m)  mtx_unlock_flags((m), 0)
 #define mtx_unlock_spin(m) mtx_unlock_spin_flags((m), 0)

-#ifdef LOCK_DEBUG
+#if 1
 #define mtx_lock_flags(m, opts) \
        _mtx_lock_flags((m), (opts), LOCK_FILE, LOCK_LINE)
 #define mtx_unlock_flags(m, opts)   \


Re: Panic with latest current/UFS_DIRHASH

2001-08-21 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Ollivier Robert writes:
>According to Ollivier Robert:
>> Just upgraded my laptop to the latest current and during installworld, got
>> this panic:
>> 
>> panic: ufsdirhash_findslot: 'ka_JP.Shift_JIS' not found

Thanks for the bug report - see my other mail to -current for
further details, but the quick answer is that dirhash has a bug
that is triggered by the odd directory entries that fsck sometimes
leaves behind. This short patch should fix it:

Ian

Index: ufs_lookup.c
===
RCS file: /FreeBSD/FreeBSD-CVS/src/sys/ufs/ufs/ufs_lookup.c,v
retrieving revision 1.52
diff -u -r1.52 ufs_lookup.c
--- ufs_lookup.c 2001/08/18 03:08:48 1.52
+++ ufs_lookup.c 2001/08/22 00:27:17
@@ -884,7 +884,7 @@
dsize = DIRSIZ(OFSFMT(dvp), nep);
spacefree += nep->d_reclen - dsize;
 #ifdef UFS_DIRHASH
-   if (dp->i_dirhash != NULL)
+   if (dp->i_dirhash != NULL && nep->d_ino)
ufsdirhash_move(dp, nep, dp->i_offset + loc,
dp->i_offset + ((char *)ep - dirbuf));
 #endif




fsck setting d_ino == 0 (was Re: filesystem errors)

2001-08-21 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Kirk McKusick writes:
>FFS will never set a directory ino == 0 at a location other
>than the first entry in a directory, but fsck will do so to
>get rid of an unwanted entry. The readdir routines know to
>skip over an ino == 0 entry no matter where in the directory
>it is found, so applications will never see such entries.
>It would be a fair amount of work to change fsck to `do the
>right thing', as the checking code is given only the current
>entry with which to work. I am of the opinion that you
>should simply accept that mid-directory block ino == 0 is
>acceptable rather than trying to `fix' the problem.

Bleh, well I guess not too surprisingly, there is a case in
ufs_direnter() (ufs_lookup.c) where the kernel does the wrong thing
when a mid-block entry has d_ino == 0. The result can be serious
directory corruption, and the bug has been there since the Lite/2
merge:

# fetch http://www.maths.tcd.ie/~iedowse/FreeBSD/dirbug_img.gz
Receiving dirbug_img.gz (6745 bytes): 100%
6745 bytes transferred in 0.0 seconds (4.69 MBps)
# gunzip dirbug_img.gz
# mdconfig -a -t vnode -f dirbug_img
md0
# fsck_ffs /dev/md0
** /dev/md0
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
20 files, 1 used, 2638 free (14 frags, 328 blocks, 0.5% fragmentation)
# mount /dev/md0 /mnt
# touch /mnt/ff12
# umount /mnt
# fsck_ffs /dev/md0
** /dev/md0
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
DIRECTORY CORRUPTED  I=2  OWNER=root MODE=40755
SIZE=512 MTIME=Aug 21 22:28 2001 
DIR=/

SALVAGE? [yn]

The bug is that when compressing directory blocks, the code trusts
the DIRSIZ() macro to calculate the amount of data to be bcopy'd
when moving a directory entry. If d_ino is zero, DIRSIZ() cannot
be trusted, so random bytes in unused portions of the directory
determine how much gets copied. I think it is very unlikely in
practice for the value returned by DIRSIZ() to be harmful, but fsck
certainly doesn't check it so this bug can be triggered after other
types of corruption have been repaired by fsck.
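
The arithmetic behind this can be sketched as follows. This is only an
illustration, not the kernel's DIRSIZ() macro: the names entry_size()
and entry_copy_size() are hypothetical, and the formula (8-byte header,
name, terminating NUL, rounded up to a multiple of 4) is the same one
used by the dircheck.pl script posted in the "filesystem errors" thread.

```c
#include <stddef.h>

/* Fixed header: d_ino (4), d_reclen (2), d_type (1), d_namlen (1). */
#define DIRENT_HDR_SIZE	8

/*
 * Hypothetical stand-in for DIRSIZ(): the space an entry needs is the
 * header, the name and a terminating NUL, rounded up to a multiple
 * of 4.  The result depends only on d_namlen.
 */
size_t
entry_size(unsigned char d_namlen)
{
	return ((DIRENT_HDR_SIZE + (size_t)d_namlen + 1 + 3) & ~(size_t)3);
}

/*
 * Guarded form: an unused entry (d_ino == 0) has no trustworthy
 * d_namlen, so it contributes no bytes to copy.
 */
size_t
entry_copy_size(unsigned long d_ino, unsigned char d_namlen)
{
	return (d_ino != 0 ? entry_size(d_namlen) : 0);
}
```

For example, a stale d_namlen of 255 left in an unused entry makes
entry_size() claim 264 bytes regardless of the entry's d_reclen,
whereas entry_copy_size() returns 0 for it.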

I just found this while looking for a dirhash bug - the dirhash
code didn't check for d_ino == 0 when compressing directories,
so it would freak when it couldn't find the entry to move. The
patch below should fix both these issues, and it makes it clearer
that DIRSIZ() is not used when d_ino == 0.

Any comments welcome. The patch is a bit larger than it needs to
be, but that directory compression code is so hard to understand
that I think it is worth clarifying it slightly :-)

Ian


Index: ufs_lookup.c
===
RCS file: /FreeBSD/FreeBSD-CVS/src/sys/ufs/ufs/ufs_lookup.c,v
retrieving revision 1.52
diff -u -r1.52 ufs_lookup.c
--- ufs_lookup.c 2001/08/18 03:08:48 1.52
+++ ufs_lookup.c 2001/08/21 23:59:09
@@ -869,26 +869,38 @@
 * dp->i_offset + dp->i_count would yield the space.
 */
ep = (struct direct *)dirbuf;
-   dsize = DIRSIZ(OFSFMT(dvp), ep);
+   dsize = ep->d_ino ? DIRSIZ(OFSFMT(dvp), ep) : 0;
spacefree = ep->d_reclen - dsize;
for (loc = ep->d_reclen; loc < dp->i_count; ) {
nep = (struct direct *)(dirbuf + loc);
-   if (ep->d_ino) {
-   /* trim the existing slot */
-   ep->d_reclen = dsize;
-   ep = (struct direct *)((char *)ep + dsize);
-   } else {
-   /* overwrite; nothing there; header is ours */
-   spacefree += dsize;
+
+   /* Trim the existing slot (NB: dsize may be zero). */
+   ep->d_reclen = dsize;
+   ep = (struct direct *)((char *)ep + dsize);
+
+   loc += nep->d_reclen;
+   if (nep->d_ino == 0) {
+   /*
+* A mid-block unused entry. Such entries are
+* never created by the kernel, but fsck_ffs
+* can create them (and it doesn't fix them).
+*
+* Add up the free space, and initialise the
+* relocated entry since we don't bcopy it.
+*/
+   spacefree += nep->d_reclen;
+   ep->d_ino = 0;
+   dsize = 0;
+   continue;
}
dsize = DIRSIZ(OFSFMT(dvp), nep);
spacefree += nep->d_reclen - dsize;
 #ifdef UFS_DIRHASH
if (dp->i_dirhash != NULL)
-   uf

Re: Panic with latest current/UFS_DIRHASH

2001-08-21 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Ollivier Robert writes:
>
>The interesting thing is that I also get that with my old 17th Jul.
>kernel... except that the panic message is 
>
>"ufsdirhash_checkblock: bad dir inode"
>
>It is always in the following part of installworld:

That's interesting - the "bad dir inode" bit in particular. I'll
look into this in more detail later. My first guess is that there
is a logic flaw in the dirhash code that only triggers when dirhash
comes across a directory entry that has had its inode zeroed by
fsck.

The kernel filesystem code only ever places unused directory entries
at the start of a directory block (free space that is not at the
start of a block is merged into an existing entry). However, fsck
can mark any entry as unused, resulting in the unfortunate situation
that fsck can put the filesystem into a state that cannot be produced
by any combination of kernel filesystem operations. That introduces
quite some potential for obscure bugs that only occur after an fsck
run...
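
A check for this state can be sketched in C along the following lines.
It is a simplified illustration, not the dirhash sanity check itself:
count_midblock_unused() is a hypothetical name, the block size is taken
to be 512 bytes, and the header fields are read in host byte order with
the layout d_ino (4 bytes), d_reclen (2), d_type (1), d_namlen (1).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIRBLKSIZ	512u	/* directory block size assumed here */

/*
 * Count entries in one directory block that are unused (d_ino == 0)
 * but do not sit at the start of the block.  Returns -1 if a record
 * length is too small, misaligned or runs past the end of the block.
 */
int
count_midblock_unused(const unsigned char *blk)
{
	size_t off = 0;
	int found = 0;

	while (off < DIRBLKSIZ) {
		uint32_t d_ino;
		uint16_t d_reclen;

		if (DIRBLKSIZ - off < 8)
			return (-1);
		/* Copy the two header fields out of the raw block. */
		memcpy(&d_ino, blk + off, sizeof(d_ino));
		memcpy(&d_reclen, blk + off + 4, sizeof(d_reclen));

		if (d_reclen < 8 || (d_reclen & 3) != 0 ||
		    d_reclen > DIRBLKSIZ - off)
			return (-1);
		/* An unused entry is only expected at offset 0. */
		if (d_ino == 0 && off != 0)
			found++;
		off += d_reclen;
	}
	return (found);
}
```

A block produced purely by kernel operations should give 0 here; a
block in which fsck has zeroed the inode of a later entry gives a
positive count.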

Ian


Re: Panic on 8/10 -current: sleeping process owns a mutex

2001-08-21 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Doug Barton writes:
>Immediately prior to the crash I was getting a lot of these on the console:
>
>Aug 12 01:00:52  Master /boot/kernel/kernel:
>/usr/local/src/sys/kern/kern_synch.c:377: sleeping with "mountlist" locked
>from /usr/local/src/sys/kern/vfs_syscalls.c:548

This should be fixed by revision 1.198 of vfs_syscalls.c. It could
only occur during unmount(), which is why it didn't show up more
often:

iedowse 2001/08/20 12:16:31 PDT

  Modified files:
sys/kern vfs_syscalls.c 
  Log:
  Avoid sleeping while holding a mutex in dounmount(). This problem
  has existed for a long time, but I made it worse a few months ago
  by adding calls to VFS_ROOT() and checkdirs() in revision 1.179.

  Also, remove the LK_REENABLE flag in the lockmgr() call; this flag
  has been ignored by the lockmgr code for 4 years. This was the only
  remaining mention of it apart from its definition.

  Reviewed by:  jhb

Ian


Re: rpc.umntall dumps core on each startup/shutdown

2001-08-03 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Maxim Sobolev writes:
>I found that the rpc.umntall program from time to time starts dumping a core
>at each startup/shutdown. Removal of /var/db/mounttab helps for certain

It seems to be a bug in the rpc library (thank $deity for libefence
when tracking down such bugs :-). The rpcbind client code in libc
keeps a cache of DNS lookups, but it is missing a strdup() when it
copies a string from the cache.
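
The class of bug can be sketched like this. The code is a made-up
illustration, not the libc rpcbind client: struct cache_entry and both
functions are hypothetical, and the copy is written out with malloc()
and memcpy() (equivalent to strdup()) only to stay within ISO C.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-entry cache standing in for the lookup cache. */
struct cache_entry {
	char *addr;	/* string owned by the cache */
};

/*
 * Buggy pattern: hands out the cache's own pointer.  If the caller
 * later free()s it, the cache is left pointing at freed memory.
 */
char *
cache_get_shared(struct cache_entry *ce)
{
	return (ce->addr);
}

/*
 * Fixed pattern: the caller gets a private copy (or NULL on failure)
 * and may free() it without affecting the cache.
 */
char *
cache_get_copy(const struct cache_entry *ce)
{
	size_t len;
	char *copy;

	if (ce->addr == NULL)
		return (NULL);
	len = strlen(ce->addr) + 1;
	copy = malloc(len);
	if (copy != NULL)
		memcpy(copy, ce->addr, len);
	return (copy);
}
```

A memory debugger such as libefence catches the first pattern as soon
as the cache entry is touched after the caller has freed the string.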

Investigating this has shown up a few bugs I introduced to rpc.umntall
in my last set of changes, so I'll fix those too.

Ian


Re: filesystem errors

2001-07-28 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Michael Harnois writes:
>
>I don't have sufficient technical knowledge to know which of you is
>right; I would just ask that filesystem corruption caused by
>restarting from a hung system not cause a panic.

I removed the extra sanity check yesterday, so if you have revision
1.3 of ufs_dirhash.c you won't see that panic again. I didn't
realise that fsck actually causes these directory entries, but just
the fact that it leaves them intact meant that the sanity check
was bad.

Ian


SIGCHLD changes causing fault on nofault entry panics

2001-07-27 Thread Ian Dowse



The panics in exit1() that have been reported on -stable appear to
be caused by these commits:

REV:1.92.2.4  kern_exit.c  2001/07/25 17:21:46   dillon
REV:1.72.2.7  kern_sig.c   2001/07/25 17:21:46   dillon

   MFC kern_exit.c 1.131, kern_sig.c 1.125 - bring SIGCHLD SIG_IGN signal
   handling in line with other operating systems.

These probably correspond to similar panics seen in -current, but I
haven't checked the details.

In the vmcore I just got, the panic occurred in the following
fragment in exit1(), when dereferencing p_sigacts (which is
p_procsig->ps_sigacts). I guess there is a race here if the parent
is exiting or something?

+   if ((p->p_pptr->p_procsig->ps_flag & PS_NOCLDWAIT)
+   || p->p_pptr->p_sigacts->ps_sigact[_SIG_IDX(SIGCHLD)] == SIG_IGN) {

Matt, I will just back out these changes from RELENG_4 shortly
until the issue is resolved. The change was non-essential and quite
contained, so it's probably better than waiting for a fix.

Ian


Re: filesystem errors

2001-07-26 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Michael Harnois writes:
>
>The only result it generated was
>
>/usr/home/mdharnois off 120 ino 0 reclen 0x188 type 010 namelen 14 name
>'.fetchmail.pid' [368]
>
>and that file is destroyed and recreated every couple of minutes.

It's the directory (/usr/home/mdharnois), not the file that is the
problem. If you recreate the directory:

cd /usr/home
mv mdharnois mdharnois.old
mkdir mdharnois
chown mdharnois:mdharnois mdharnois # (or whatever)
mv mdharnois.old/* mdharnois/
mv mdharnois.old/.[a-zA-Z0-9]* mdharnois/
rmdir mdharnois.old

this problem should go away permanently. Even just creating loads of
files in the existing directory might be enough to reuse the bit of
the directory that has d_ino == 0. Running

./dircheck.pl /usr/home/mdharnois

will check if there is still a problem.

However, I'd like to know if this is something that fsck should
detect and correct automatically. It is an odd case, because the
ffs filesystem code never creates directory entries like this, but
I think it will not object to them if it finds them. This kind of
ambiguity is probably a bad thing.

Ian


Re: filesystem errors

2001-07-26 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Michael Harnois writes:
>I'm tearing my hair out trying to find a filesystem error that's
>causing me a panic: ufsdirhash_checkblock: bad dir inode.
>
>When I run fsck from a single user boot, it finds no errors.
>
>When I run it on the same filesystem mounted, it finds errors: but, of
>course, it then can't correct them

[Kirk, I'm cc'ing you because here the dirhash code sanity checks
found a directory entry with d_ino == 0 that was not at the start
of a DIRBLKSIZ block. This doesn't happen normally, but it seems
from this report that fsck does not correct this. Is it a basic
filesystem assumption that d_ino == 0 can only happen at the start
of a directory block, or is it something the code should tolerate?]

Interesting - this is an error reported by the UFS_DIRHASH code
that you enabled in your kernel config. A sanity check that the
dirhash code is performing is failing. These checks are designed
to catch bugs in the dirhash code, but in this case I think it may
be a bug that fsck is not finding this problem, or else my sanity
tests are too strict.

A workaround is to turn off the sanity checks with:

sysctl vfs.ufs.dirhash_docheck=0

or to remove UFS_DIRHASH from your kernel config. You could also
try to find the directory that is causing the problems. Copy the
following script to a file called dircheck.pl, and try running:

chmod 755 dircheck.pl
find / -fstype ufs -type d -print0 | xargs -0 ./dircheck.pl

That should show up any directories that would fail that dirhash
sanity check - there will probably just be one or two that resulted
from some old filesystem corruption.

Ian

#!/usr/local/bin/perl

while (defined($dir = shift)) {
    unless (open(DIR, "$dir")) {
        print STDERR "$dir: $!\n";
        next;
    }

    $b = 0;
    my(%dir) = ();

    while (sysread(DIR, $dat, 512) == 512) {
        $off = 0;
        while (length($dat) > 0) {
            ($dir{'d_ino'}, $dir{'d_reclen'}, $dir{'d_type'},
                $dir{'d_namlen'}) = unpack("LSCC", $dat);
            $dir{'d_name'} = substr($dat, 8, $dir{'d_namlen'});
            $minreclen = (8 + $dir{'d_namlen'} + 1 + 3) & (~3);
            $gapinfo = ($dir{'d_reclen'} == $minreclen) ? "" :
                sprintf("[%d]", $dir{'d_reclen'} - $minreclen);

            if ($dir{'d_ino'} == 0 && $off != 0) {
                printf("%s off %d ino %d reclen 0x%x type 0%o"
                    . " namelen %d name '%s' %s\n",
                    $dir, $off, $dir{'d_ino'},
                    $dir{'d_reclen'}, $dir{'d_type'},
                    $dir{'d_namlen'}, $dir{'d_name'},
                    $gapinfo);
            }
            if ($dir{'d_reclen'} > length($dat)) {
                die "reclen too long!\n";
            }
            $dat = substr($dat, $dir{'d_reclen'});
            $off += $dir{'d_reclen'};
        }
        $b++;
    }
}


Re: NFS client unable to recover from server crash

2001-07-23 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Matt Dillon writes:
>   Ian, please don't do this.  The whole point of having an uninterruptible
>   mount is so the client can survive a server reboot or network failure.
>   Doing this destroys uninterruptible semantics.

Firstly, I have no intention of committing this patch anytime soon,
but I think you are mistaken in what it does. Unless I messed up,
it will never allow "uninterruptible" mounts to be interrupted by
signals or to time out. I set the socket timeout to 10 seconds,
but that will have no effect because the code will simply loop
around and retry again. It is nfs_sigintr() that detects signals,
and it returns immediately unless the NFSMNT_INT mount flag is set.

Similarly, the request only times out if rep->r_rexmit >= r_retry,
but unless it is a "soft" nfs mount, r_rexmit is clamped at
NFS_MAXREXMIT, and r_retry is set to NFS_MAXREXMIT + 1, so this
can never happen.
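
As a rough model of that argument (a sketch only, not the code in
nfs_socket.c): request_times_out() below is a hypothetical function,
the value 100 for NFS_MAXREXMIT is an assumption made for the example,
and the clamp is applied on every retransmission for simplicity.

```c
/* Illustrative value only; the real constant lives in the NFS headers. */
#define NFS_MAXREXMIT	100

/*
 * Returns 1 if a request would ever satisfy r_rexmit >= r_retry
 * within max_retransmits retransmissions, and 0 otherwise.
 * soft_retry is the retry limit that applies to soft mounts.
 */
int
request_times_out(int soft_mount, int soft_retry, int max_retransmits)
{
	int r_retry = soft_mount ? soft_retry : NFS_MAXREXMIT + 1;
	int r_rexmit = 0;
	int i;

	for (i = 0; i < max_retransmits; i++) {
		/* The counter is clamped, so it never exceeds the maximum. */
		if (++r_rexmit > NFS_MAXREXMIT)
			r_rexmit = NFS_MAXREXMIT;
		if (r_rexmit >= r_retry)
			return (1);
	}
	return (0);
}
```

For a hard mount the limit is one more than the value the counter is
clamped to, so the function returns 0 no matter how many
retransmissions are allowed; a soft mount with a limit of 10 times out
on the tenth.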

The only effect of changing that timeout value (again assuming I
have not misread the code) is to allow any request that does get
marked R_SOFTTERM to time out within a finite period. For hard
mounts, the _only_ way that this can happen is via the new
nfs_nmcancelreqs() which is called when you do a forced unmount.

No, I haven't gone mad and decided to make all NFS mounts soft
to "fix" all NFS problems :-)

Ian


Re: NFS client unable to recover from server crash

2001-07-23 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Maxim Sobolev writes:
>I found that after the introduction of the new RPC code, the NFS client is
>no longer able to recover from a server crash (both client and server are
>5-CURRENT systems). After the well-known `nfs server not responding'
>message, the client hangs, and even though the server comes back in a
>minute or two it doesn't recover and just sits in this state forever. All
>unmount requests get stuck in the kernel, as do any processes accessing
>files from that mount point. This doesn't look right and obviously should
>be fixed before 5.0-RELEASE.

I've seen some similar effects, but I don't think it has anything
to do with the new RPC code, as that only runs at mount time. It
would be useful if you could use tcpdump to see if any requests
are being transmitted, and if they are getting responses. Also
try running kgdb on the client to get a kernel backtrace of the
stuck processes.

Is this a UDP or TCP based mount?

If you are feeling brave, you could also try the patch below. It
is a selection of changes to the kernel NFS code that I have built
up over the last few months. I don't think it could solve the hangs,
but it should improve the chance of interruptible mounts accepting
^C while waiting, and (just added the other day) umount -f should
work while the server is down even if processes are hung.

Ian


Index: nfs.h
===
RCS file: /dump/FreeBSD-CVS/src/sys/nfs/nfs.h,v
retrieving revision 1.59
diff -u -r1.59 nfs.h
--- nfs.h   2001/04/17 20:45:21 1.59
+++ nfs.h   2001/07/20 13:19:51
@@ -633,6 +633,7 @@
  struct mbuf *));
 int    nfs_adv __P((struct mbuf **, caddr_t *, int, int));
 void   nfs_nhinit __P((void));
+void   nfs_nmcancelreqs __P((struct nfsmount *));
 void   nfs_timer __P((void*));
 int    nfsrv_dorec __P((struct nfssvc_sock *, struct nfsd *, 
 struct nfsrv_descript **));
Index: nfs_nqlease.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/nfs/nfs_nqlease.c,v
retrieving revision 1.59
diff -u -r1.59 nfs_nqlease.c
--- nfs_nqlease.c   2001/05/01 08:13:14 1.59
+++ nfs_nqlease.c   2001/05/01 14:29:22
@@ -952,7 +952,9 @@
 }
 
 /*
- * Called for client side callbacks
+ * Called for client side callbacks.
+ * NB: We are responsible for freeing `mrep' in all cases, but note
+ * that anything that does a 'goto nfsmout' frees it for us.
  */
 int
 nqnfs_callback(nmp, mrep, md, dpos)
@@ -982,8 +984,10 @@
nfsd->nd_md = md;
nfsd->nd_dpos = dpos;
error = nfs_getreq(nfsd, &tnfsd, FALSE);
-   if (error)
+   if (error) {
+   m_freem(mrep);
return (error);
+   }
md = nfsd->nd_md;
dpos = nfsd->nd_dpos;
if (nfsd->nd_procnum != NQNFSPROC_EVICTED) {
Index: nfs_socket.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/nfs/nfs_socket.c,v
retrieving revision 1.66
diff -u -r1.66 nfs_socket.c
--- nfs_socket.c 2001/05/01 08:13:14 1.66
+++ nfs_socket.c 2001/07/20 13:45:01
@@ -144,7 +144,8 @@
  */
 #define NFS_CWNDSCALE   256
 #define NFS_MAXCWND (NFS_CWNDSCALE * 32)
-static int nfs_backoff[8] = { 2, 4, 8, 16, 32, 64, 128, 256, };
+#define NFS_NBACKOFF   8
+static int nfs_backoff[NFS_NBACKOFF] = { 2, 4, 8, 16, 32, 64, 128, 256, };
 int nfsrtton = 0;
 struct nfsrtt nfsrtt;
 struct callout_handle  nfs_timer_handle;
@@ -299,11 +300,17 @@
splx(s);
}
if (nmp->nm_flag & (NFSMNT_SOFT | NFSMNT_INT)) {
-   so->so_rcv.sb_timeo = (5 * hz);
-   so->so_snd.sb_timeo = (5 * hz);
+   so->so_rcv.sb_timeo = (2 * hz);
+   so->so_snd.sb_timeo = (2 * hz);
} else {
-   so->so_rcv.sb_timeo = 0;
-   so->so_snd.sb_timeo = 0;
+   /*
+* We would normally set the timeouts to 0 (never time out)
+* for non-interruptible mounts. However, nfs_nmcancelreqs()
+* can still prematurely terminate requests, so avoid
+* waiting forever.
+*/
+   so->so_rcv.sb_timeo = 10 * hz;
+   so->so_snd.sb_timeo = 10 * hz;
}
 
/*
@@ -1400,10 +1407,18 @@
for (rep = nfs_reqq.tqh_first; rep != 0; rep = rep->r_chain.tqe_next) {
nmp = rep->r_nmp;
if (rep->r_mrep || (rep->r_flags & R_SOFTTERM))
-   continue;
-   if (nfs_sigintr(nmp, rep, rep->r_procp)) {
-   nfs_softterm(rep);
continue;
+   /*
+* Test for signals on interruptible mounts. We try to
+* maintain normal (uninterruptible) semantics while the
+* server is up, but respond quickly to signals when it
+

Re: Load average synchronisation and phantom loads

2001-07-18 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Bruce Evans writes:
>On Tue, 17 Jul 2001, Ian Dowse wrote:
>> effect in the load calculation, but even for the shorter 5-minute
>> timescale, this will average out to typically no more than a few
>> percent (i.e. the "5 minutes" will instead normally be approx 4.8
>> to 5.2 minutes). Apart from a marginally more wobbly xload display,
>> this should not make any real difference.
>
>It should average out to precisely the same as before if you haven't
>changed the mean (mean = average :-).  The real difference may be
>small, but I think it is an unnecessary regression.

I meant the "5-minute average" that is computed; it will certainly
not be precisely the same as before, though it will be similar.

>from 0 very fast.  Even with a large variation, the drift might not be
>fast enough.

Actually, it's not too bad with a +-1 second variation, which is
why I chose a value that large. If you plot 60 samples (60 is the
number of 5-second intervals in the 5-minute load average timescale)
you get a relatively good dispersion of points throughout the
5-second interval. Try pasting the following into gnuplot a few
times:

 plot [] [-2.5:2.5] \
 "1){}select(undef,undef,undef,2.5)}'

(5-second period, 50% duty cycle) then the interference pattern
resulting from a +-1 tick variation has a period that is typically
days long! Of course the interference pattern caused by the above
script has an infinitely long period with the old load average
calculation; it always causes an additional load of 1.0 even though
the %CPU usage is approx 50%.
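
The synchronisation effect described here can be reproduced with a
small deterministic model. This is only a sketch: busy_samples() is a
hypothetical function, time is counted in milliseconds, and a fixed
20 ms offset stands in for the random variation so that the result is
repeatable.

```c
/*
 * Count how many of nsamples, taken every interval_ms milliseconds
 * starting at t = 0, find a process runnable when that process is
 * busy for the first 2500 ms of every 5000 ms period.
 */
int
busy_samples(long interval_ms, int nsamples)
{
	long t = 0;
	int busy = 0;
	int i;

	for (i = 0; i < nsamples; i++) {
		if (t % 5000 < 2500)
			busy++;
		t += interval_ms;
	}
	return (busy);
}
```

busy_samples(5000, 250) returns 250: a sampler locked to the 5-second
period always finds the process runnable, which corresponds to a load
of 1.0. busy_samples(5020, 250) returns 125, matching the 50% duty
cycle, because the sampling point walks through the whole period.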

>> The alternative that I considered was to sample the processes once
>> during every 5-second interval, but to place the sampling point
>> randomly over the interval. That probably results in a better
>
>I rather like this.  With immediate update, It's almost equivalent to
>your current method with a random variation of between -5 and 5 seconds
>instead of between -1 and 1 seconds.  Your current method doesn't
>really reduce the jitter -- it just concentrates it into a smaller
>interval.

When I tried this approach (with immediate update), I didn't like
the jumpiness of the load average. Instead of the relatively smooth
decay that I'm used to, the way it sometimes changed twice in short
succession and sometimes did not change for nearly 10 seconds was
quite noticeable. I'd be quite happy to go with the delayed version
of this, though it does mean having two timer routines, and storing
the `nrun' somewhere between samples and updates.

>hopefully rare.  Use a (small) random variation to reduce phase effects
>for such processes.  I think there are none in the kernel.  I would try
>using the following magic numbers:
>
>sample interval = 5.02 seconds (approx) (not 5.01, so that the random
> variation never gives a multiple
> of 1.00)
>random variation = 0+-0.01 seconds (approx)
>cexp[] = adjusted for 5.02 instead of 5.00

See above. I really want to try and avoid _any_ significant
synchronisation effects, not just those that are caused by the
kernel or by applications that happen to have a run pattern with
a period of N * 1.0 seconds.

Ian

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message

Re: Load average synchronisation and phantom loads

2001-07-17 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Bruce Evans writes:
>
>I think that is far too much variation.  5 seconds is hard-coded into
>the computation of the load average (constants in cexp[]), so even a
>variation of +-1 ticks breaks the computation slightly.

I have not changed the mean inter-sample time from 5 seconds (*),
so is this really a problem? There will be a slight time-warping
effect in the load calculation, but even for the shorter 5-minute
timescale, this will average out to typically no more than a few
percent (i.e. the "5 minutes" will instead normally be approx 4.8
to 5.2 minutes). Apart from a marginally more wobbly xload display,
this should not make any real difference.
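
A minimal sketch of the fixed-point decay being discussed, written as
standalone C rather than kernel code. The names loadav_update and
cexp_tab are made up for this example, and FSHIFT = 11 is assumed; the
three constants are exp(-5/60), exp(-5/300) and exp(-5/900), which is
where the 5-second sample interval is baked in.

```c
#include <stdint.h>

/* Fixed-point parameters; FSHIFT = 11 is an assumption for this sketch. */
#define FSHIFT 11
#define FSCALE (1 << FSHIFT)

/*
 * Decay factors exp(-5/60), exp(-5/300) and exp(-5/900), scaled by FSCALE.
 * They are only exact when samples arrive exactly 5 seconds apart.
 */
static const uint32_t cexp_tab[3] = {
	(uint32_t)(0.9200444146293232 * FSCALE),	/* 1 minute */
	(uint32_t)(0.9834714538216174 * FSCALE),	/* 5 minutes */
	(uint32_t)(0.9944598480048967 * FSCALE),	/* 15 minutes */
};

/* Fold one sample of nrun runnable processes into the three averages. */
static void
loadav_update(uint32_t ldavg[3], uint32_t nrun)
{
	for (int i = 0; i < 3; i++)
		ldavg[i] = (cexp_tab[i] * ldavg[i] +
		    nrun * FSCALE * (FSCALE - cexp_tab[i])) >> FSHIFT;
}
```

If the gap between samples varies around a mean of 5 seconds, each
update still applies the same factor, so the effective time constant
stretches or shrinks with the actual gap.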

If the variation was much smaller than it is in the proposed patch,
you could get a noticeable drifting in and out of phase with processes
that have a regular run-pause pattern. Obviously this is a much
bigger problem when the sample period is fixed like it is now, but
I wanted to minimise the possibility of this effect while keeping
the inter-update time "relatively" constant.

The alternative that I considered was to sample the processes once
during every 5-second interval, but to place the sampling point
randomly over the interval. That probably results in a better
synchronisation-avoidance behaviour. However, to incorporate the
sample into the load average requires either waiting until the end
of the interval, or updating the load average at the time of
sampling. The former introduces a new delay into the load average
computation, and the latter results in a lot of very noticeable
jitter on the inter-sample interval.

(*) Actually, I have changed the mean by 0.5 ticks, but that is a
bug that I will fix. The "hz * 4 + random() % (hz * 2)" should be
"hz * 4 + random() % (hz * 2 + 1)" instead.
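
The 0.5-tick shift can be checked with a short calculation. This is
only a sketch: twice_mean_delay is a name invented here, and it assumes
the random value is uniform over all residues of the modulus. It
returns twice the mean so that the half tick stays an integer.

```c
/*
 * Twice the mean delay, in ticks, of "hz * 4 + (r % span)" when r is
 * uniformly distributed, i.e. r % span takes each of 0..span-1 equally.
 */
static long
twice_mean_delay(long hz, long span)
{
	long sum = 0;

	for (long r = 0; r < span; r++)
		sum += hz * 4 + r;
	return (2 * sum / span);
}
```

With hz = 100, a span of hz * 2 gives a mean of 499.5 ticks, while a
span of hz * 2 + 1 gives exactly 500 ticks, i.e. 5 seconds.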

>Not another SYSINIT (all SYSINITs are evil IMO).  SI_SUB_PSEUDO is
>bogus here -- there are no pseudo ttys here.  sched_setup() is a
>good place to do this initialization.

John Baldwin suggested moving the load average calculation into
kern_synch.c, so it would certainly make sense to initialise it
from sched_setup() then. This seems like a good idea to me; does
that sound OK?

Ian


Load average synchronisation and phantom loads

2001-07-15 Thread Ian Dowse



There are a few PRs and a number of messages in the mailing list
archives that describe a problem where the load average occasionally
remains at 1.0 or greater even though top(1) reports that the CPU
is nearly 100% idle. The PRs I could find in a quick search are
kern/21155, kern/23448 and kern/27334.

The most probable cause for this effect is a synchronisation between
the load measurement and processes that periodically run for short
amounts of time. The load average is based on samples of the number
of running processes taken at exact 5-second intervals. If some
other process regularly runs with a period that divides into 5
seconds, that process may always be seen as running even though it
may only run for a tiny fraction of the available CPU time.

A very likely candidate process is bufdaemon; it sleeps for 1 second
at a time, so if it happens to get scheduled in the same tick as
the load measurement and before the load measurement, it will always
be seen as running.

The patch below causes the samples of running processes to be
somewhat randomised; instead of being taken every 5 seconds, the
gap now varies in the range 4 to 6 seconds, so that synchronisation
should no longer occur. Would there be any objections to my committing
this?
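
The effect can be illustrated with a small userland simulation. This is
a sketch under stated assumptions, not the kernel code: the process is
modelled as running only during the first tick of every second, the
sampler starts in phase with it, and a simple LCG stands in for
random(). The names lcg_next and count_hits are invented here.

```c
#include <stdint.h>

/* Small deterministic LCG so the sketch does not depend on random(3). */
static uint32_t lcg_state = 12345;

static uint32_t
lcg_next(void)
{
	lcg_state = lcg_state * 1103515245u + 12345u;
	return ((lcg_state >> 16) & 0x7fff);
}

/*
 * Count how many of nsamples samples find a periodic process running.
 * The process runs during the first tick of every hz-tick period.
 * With jitter == 0 the samples are exactly 5 * hz ticks apart;
 * otherwise each gap is drawn from 4 * hz .. 6 * hz ticks.
 */
static int
count_hits(int hz, int nsamples, int jitter)
{
	long t = 0;
	int hits = 0;

	for (int i = 0; i < nsamples; i++) {
		if (t % hz == 0)
			hits++;
		if (jitter)
			t += hz * 4 + (long)(lcg_next() % (uint32_t)(hz * 2 + 1));
		else
			t += hz * 5;
	}
	return (hits);
}
```

With a fixed gap every sample lands on the running tick, so the process
contributes a full 1.0 to the load. With the randomised gap only a
small fraction of samples should catch it.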

Two comments on the patch:
- This patch removes the SSLEEP case in loadav(), because in the
  existing code, p->p_slptime has always just been incremented in
  schedcpu() so this case never made a difference. To keep the same
  load average behaviour when loadav() is called at different times,
  this case needs to be removed.

- The load average calculation now has really nothing to do with
  the VM system, so it could be moved elsewhere. I've just left
  it in vm_meter.c because that's where it's always been.

Ian

Index: vm/vm_meter.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/vm/vm_meter.c,v
retrieving revision 1.57
diff -u -r1.57 vm_meter.c
--- vm/vm_meter.c   2001/07/04 19:00:12 1.57
+++ vm/vm_meter.c   2001/07/15 20:54:38
@@ -53,8 +53,11 @@
 #include 
 #include 
 
+static void loadav_init(void);
+
 struct loadavg averunnable =
 	{ {0, 0, 0}, FSCALE };	/* load average, of runnable procs */
+static struct callout loadav_callout;
 
 struct vmmeter cnt;
 
@@ -75,19 +78,17 @@
  * 1, 5 and 15 minute intervals.
  */
 static void
-loadav(struct loadavg *avg)
+loadav(void *arg)
 {
 	int i, nrun;
+	struct loadavg *avg;
 	struct proc *p;
 
+	avg = (struct loadavg *)arg;
 	sx_slock(&allproc_lock);
-	for (nrun = 0, p = LIST_FIRST(&allproc); p != 0; p = LIST_NEXT(p, p_list)) {
+	for (nrun = 0, p = LIST_FIRST(&allproc); p != 0;
+	    p = LIST_NEXT(p, p_list)) {
 		switch (p->p_stat) {
-		case SSLEEP:
-			if (p->p_pri.pri_level > PZERO ||
-			    p->p_slptime != 0)
-				continue;
-			/* FALLTHROUGH */
 		case SRUN:
 			if ((p->p_flag & P_NOLOAD) != 0)
 				continue;
@@ -100,15 +101,24 @@
 	for (i = 0; i < 3; i++)
 		avg->ldavg[i] = (cexp[i] * avg->ldavg[i] +
 		    nrun * FSCALE * (FSCALE - cexp[i])) >> FSHIFT;
+
+	/*
+	 * Schedule the next update to occur in 5 seconds, but add a
+	 * random variation to help avoid synchronisation with
+	 * processes that run at regular intervals.
+	 */
+	callout_reset(&loadav_callout, hz * 4 + (int)(random() % (hz * 2)),
+	    loadav, arg);
 }
 
-void
-vmmeter()
+static void
+loadav_init()
 {
-
-	if (time_second % 5 == 0)
-		loadav(&averunnable);
+	callout_init(&loadav_callout, 0);
+	loadav(&averunnable);
 }
+SYSINIT(loadav, SI_SUB_PSEUDO, SI_ORDER_ANY, loadav_init, NULL)
+
 
 SYSCTL_UINT(_vm, VM_V_FREE_MIN, v_free_min,
 	CTLFLAG_RW, &cnt.v_free_min, 0, "");
Index: vm/vm_extern.h
===
RCS file: /dump/FreeBSD-CVS/src/sys/vm/vm_extern.h,v
retrieving revision 1.47
diff -u -r1.47 vm_extern.h
--- vm/vm_extern.h  2000/03/13 10:47:24 1.47
+++ vm/vm_extern.h  2001/07/15 20:36:14
@@ -84,7 +84,6 @@
 int vm_mmap __P((vm_map_t, vm_offset_t *, vm_size_t, vm_prot_t, vm_prot_t, int, void *, vm_ooffset_t));
 vm_offset_t vm_page_alloc_contig __P((vm_offset_t, vm_offset_t, vm_offset_t, vm_offset_t));
 void vm_set_page_size __P((void));
-void vmmeter __P((void));
 struct vmspace *vmspace_alloc __P((vm_offset_t, vm_offset_t));
 struct vmspace *vmspace_fork __P((struct vmspace *));
 void vmspace_exec __P((struct proc *));
Index: kern/kern_synch.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/kern/kern_synch.c,v
retrieving revision 1.148
diff -u -r1.148 kern_synch.c
--- kern/kern_synch.c   2001/07/06 01:16:42 1.148
+++ kern/k

Re: disklabel broken again?

2001-07-13 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>In message <Pine.BSF.4.21.0107122138260.61694-10@beppo>, Matthew Jacob writes:
>>dd if=/dev/zero of=/dev/da5 bs=1024k count=10
>>...
>>disklabel -Brw da5 auto
>>disklabel: No space left on device
>
>I think this can happen when there is an existing label on the
>disk, but I forget the exact conditions. Try dd'ing a few k of
>zeros on to the disk and run the disklabel again?

Whoops, I'm not awake. Ignore that! :-)

Ian


Re: disklabel broken again?

2001-07-13 Thread Ian Dowse


In message , Matthew Jacob writes:
>
>Sometime in the last few days, disklabel -Brw auto seems to have stopped
>working for me on alpha. It used to be the thing of:

>Now I get:
>
>dd if=/dev/zero of=/dev/da5 bs=1024k count=10
>...
>disklabel -Brw da5 auto
>disklabel: No space left on device

I think this can happen when there is an existing label on the
disk, but I forget the exact conditions. Try dd'ing a few k of
zeros on to the disk and run the disklabel again?

Ian


Re: MSDOS filesystem mounting...

2001-07-09 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Jim Bryant writes:
>
> 6:20:34pm  wahoo(113): mount_msdos /dev/da0s1 /ms-dog
>mount_msdos: vfsload(msdos): No such file or directory

Try "mount_msdosfs" instead of "mount_msdos". The latter is probably
a stale binary left on your system from before the rename that took
place last month.

Ian


Re: dirpref and RELENG_4 fsck

2001-06-03 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Alfred Perlstein writes:
>> Was it determined that the fsck corruption problems which were seen
>> with fsck after the introduction of the dirpref changes do not affect
>> RELENG_4?  I haven't seen any MFC of changes to the RELENG_4 fsck
>> code, and I'm kind of worried now that I've reverted my current system
>> back to RELENG_4 :-)
>
>Afaik the problem was that fsck would wipe certain stats info that
>dirpref would use, however I think the kernel detects absurd values
>and will reinit them.

Yes, there was a problem in the -current kernel that could cause
crashes if fsck erased some dirpref-related values in the superblock.

However, the main issue affecting moving filesystems back and forth
between RELENG_4 and -current is that fsck in RELENG_4 does not
know about the new superblock fields (dirpref, pending*, snapshots?).
It will detect a mismatch between the master and alternate superblocks
and give an error.  Once you fix that error (I think you just say
yes to the "LOOK FOR ALTERNATE SUPERBLOCKS?" question), then
everything should be fine.

One final annoyance is that using an alternate superblock will undo
any changes made by tunefs, unless the '-A' flag had been used with
tunefs originally. Typically this will result in soft-updates
getting disabled.

RELENG_4's fsck could probably be updated to deal with this a bit
better, but I don't think it can do the right thing if any snapshots
exist, so the error may be a good thing.

Ian


Re: cvs commit: src/sbin/fsck_ffs setup.c

2001-05-29 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>iedowse 2001/05/29 13:45:09 PDT
>
>  Modified files:
>    sbin/fsck_ffs        setup.c
>  Log:
>  Ignore the new superblock fields fs_pendingblocks and fs_pendinginodes
>  when comparing with the alternate superblock. These fields are used
>  for temporary in-core information only. This should fix the "VALUES
>  IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE" error from
>  fsck_ffs that has been seen a lot recently.

Note that this will not fix the softupdates freelist corruption
problem that people have been reporting. It seems that Kirk is away
for at least another week, so if Tor's suggested fix for that works,
then it should probably be committed in the meantime.

Ian


Re: wierdness with mountd

2001-05-29 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>error? (untested patch below). 

I braino'd that patch (error vs. errno), but I have just committed
a working version that should stop the mountd warnings.
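
For readers following along, here is a sketch of the difference between
testing the return value and testing errno. It is an illustration only,
not the committed change: fake_delexport is a hypothetical stand-in for
the mount(2) call, and should_log and should_log_braino are names made
up for this example.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the MNT_DELEXPORT mount(2) call: on failure
 * it behaves like a system call, returning -1 and setting errno.
 */
static int
fake_delexport(int err)
{
	if (err != 0) {
		errno = err;
		return (-1);
	}
	return (0);
}

/* Return 1 if a failure should be logged, 0 on success or plain ENOENT. */
static int
should_log(int err)
{
	if (fake_delexport(err) == 0)
		return (0);
	/* The return value is only -1; the reason lives in errno. */
	return (errno != ENOENT);
}

/* The mistaken form: compares the -1 return value with ENOENT. */
static int
should_log_braino(int err)
{
	int error = fake_delexport(err);

	return (error && error != ENOENT);
}
```

Because the return value is -1 on any failure, the mistaken form still
logs the ENOENT case that it was meant to suppress.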

Ian


Re: wierdness with mountd

2001-05-29 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, John writes:

>Looking in /usr/src/sbin/mountd/mountd.c, under line 930
>shows the following:
>
>num = getmntinfo(&fsp, MNT_NOWAIT);
>
>and then runs through a loop 'num' times trying to
>delete any export for each entry.

Thanks, you're right - this has nothing to do with mountdtab or
mounttab. The commit that caused these messages to appear is
phk's centralisation of the kernel netexport structure:

REV:1.149   ffs_vfsops.c   2001/04/25 07:07:50   phk

   Move the netexport structure from the fs-specific mount structure
   to struct mount.
   ...

Doing a MNT_DELEXPORT mount used to be a no-op if there were no
exports, but now it returns EINVAL. Maybe that should be changed
to ENOENT or something, so that mountd can detect it as a 'normal'
error? (untested patch below). 

Ian


Index: sys/kern/vfs_export.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/kern/vfs_export.c,v
retrieving revision 1.310
diff -u -r1.310 vfs_export.c
--- sys/kern/vfs_export.c   2001/04/26 20:47:14 1.310
+++ sys/kern/vfs_export.c   2001/05/29 09:28:43
@@ -207,7 +207,7 @@
 	nep = mp->mnt_export;
 	if (argp->ex_flags & MNT_DELEXPORT) {
 		if (nep == NULL)
-			return (EINVAL);
+			return (ENOENT);
 		if (mp->mnt_flag & MNT_EXPUBLIC) {
 			vfs_setpublicfs(NULL, NULL, NULL);
 			mp->mnt_flag &= ~MNT_EXPUBLIC;
Index: sbin/mountd/mountd.c
===
RCS file: /dump/FreeBSD-CVS/src/sbin/mountd/mountd.c,v
retrieving revision 1.51
diff -u -r1.51 mountd.c
--- sbin/mountd/mountd.c   2001/05/25 08:14:02 1.51
+++ sbin/mountd/mountd.c   2001/05/29 09:31:43
@@ -903,6 +903,7 @@
 	struct xucred anon;
 	char *cp, *endcp, *dirp, *hst, *usr, *dom, savedc;
 	int len, has_host, exflags, got_nondir, dirplen, num, i, netgrp;
+	int error;
 
 	dirp = NULL;
 	dirplen = 0;
@@ -949,10 +950,11 @@
 		    !strcmp(fsp->f_fstypename, "cd9660")) {
 			targs.ua.fspec = NULL;
 			targs.ua.export.ex_flags = MNT_DELEXPORT;
-			if (mount(fsp->f_fstypename, fsp->f_mntonname,
-				  fsp->f_flags | MNT_UPDATE,
-				  (caddr_t)&targs) < 0)
-				syslog(LOG_ERR, "can't delete exports for %s",
+			error = mount(fsp->f_fstypename, fsp->f_mntonname,
+			    fsp->f_flags | MNT_UPDATE, (caddr_t)&targs);
+			if (error && error != ENOENT)
+				syslog(LOG_ERR,
+				    "can't delete exports for %s: %m",
 				    fsp->f_mntonname);
 		}
 		fsp++;


Re: wierdness with mountd

2001-05-28 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, John Polstra writes:
>In article <[EMAIL PROTECTED]>,
>Matthew Jacob  <[EMAIL PROTECTED]> wrote:
>> May 28 10:21:43 farrago mountd[217]: can't delete exports for /tmp
>> May 28 10:21:43 farrago mountd[217]: can't delete exports for /usr/obj
>
>I've been seeing this too, on a -current system from around May 5.

This sounds like there are stale entries in /var/db/mountdtab, but
I'm not familiar enough with the purpose of mountdtab to know why
this is happening. I'll look into this further over the next few
days; for now maybe try cleaning out mountdtab manually?

Ian


Re: Strangeness with newsyslog/wtmp

2001-05-01 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, "Thomas D. Dean" writes:
>I notice that my /var/log/wtmp has strange renewal times.  I don't
>know when it was not like this.  newsyslog.conf is set to renew this
>once per week.  What is causing this?

-rw-rw-r--  1 root  wheel   27 Apr 15 12:00 /var/log/wtmp.3.gz
-rw-rw-r--  1 root  wheel  244 Apr 13 15:52 /var/log/wtmp.4.gz
-rw-rw-r--  1 root  wheel  176 Apr  8 12:12 /var/log/wtmp.5.gz
-rw-rw-r--  1 root  wheel  148 Apr  3 10:51 /var/log/wtmp.6.gz
-rw-rw-r--  1 root  wheel  280 Mar 30 21:16 /var/log/wtmp.7.gz

Gzip by default preserves the last-modified time of a file when
gzipping, so these times are actually the times at which the wtmp
file was previously modified before being rotated.

Try "ls -lc", which will show up the rotation time.
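
The two timestamps can also be compared directly with stat(2). The
following is a rough sketch, assuming a POSIX system with a writable
/tmp; backdate_and_stat and mtime_ctime_demo are names invented here,
and utime(3) stands in for whatever gzip does internally to carry the
old modification time over to the new file.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <utime.h>

/* Set the modification time of path back to old_mtime, then stat it. */
static int
backdate_and_stat(const char *path, time_t old_mtime, struct stat *sb)
{
	struct utimbuf tb;

	tb.actime = old_mtime;
	tb.modtime = old_mtime;
	if (utime(path, &tb) != 0)
		return (-1);
	return (stat(path, sb));
}

/* Return 1 if a backdated file keeps the old mtime but has a newer ctime. */
static int
mtime_ctime_demo(void)
{
	char path[] = "/tmp/wtmpdemoXXXXXX";
	struct stat sb;
	const time_t old = 986000000;	/* an arbitrary instant in 2001 */
	int fd, ok;

	fd = mkstemp(path);
	if (fd < 0)
		return (0);
	ok = backdate_and_stat(path, old, &sb) == 0 &&
	    sb.st_mtime == old &&	/* the time "ls -l" reports */
	    sb.st_ctime > old;		/* the time "ls -lc" reports */
	close(fd);
	unlink(path);
	return (ok);
}
```

The inode change time moves whenever the timestamps themselves are
rewritten, which is why it reflects the moment of rotation.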

Ian


Re: kernel core

2001-04-23 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Ian Dowse writes:
>You could try this (untested). I have to run now, but I can test it
>later as it's easy enough to reproduce.

Almost, but I missed the fs_contigdirs field, which was the real
culprit. An updated patch is below; this seems to stop the panics
for me. I'll just run this by Kirk first, and commit it if he has
no objections.

There probably does need to be something in UPDATING saying that
after the dirpref changes have been used, running a pre-dirpref
version of fsck may generate some serious-looking warnings that
are actually harmless. I think some people were seeing:

VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE

Is that correct? And was a "fsck -b 32 /dev/xxx" required to fix it
or did fsck correct the problem itself?

Ian

Index: ffs_vfsops.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.146
diff -u -r1.146 ffs_vfsops.c
--- ffs_vfsops.c   2001/04/17 05:37:51 1.146
+++ ffs_vfsops.c   2001/04/23 23:37:14
@@ -421,12 +421,18 @@
 	 */
 	newfs->fs_csp = fs->fs_csp;
 	newfs->fs_maxcluster = fs->fs_maxcluster;
+	newfs->fs_contigdirs = fs->fs_contigdirs;
 	bcopy(newfs, fs, (u_int)fs->fs_sbsize);
 	if (fs->fs_sbsize < SBSIZE)
 		bp->b_flags |= B_INVAL | B_NOCACHE;
 	brelse(bp);
 	mp->mnt_maxsymlinklen = fs->fs_maxsymlinklen;
 	ffs_oldfscompat(fs);
+	/* An old fsck may have zeroed these fields, so recheck them. */
+	if (fs->fs_avgfilesize <= 0)		/* XXX */
+		fs->fs_avgfilesize = AVFILESIZ;	/* XXX */
+	if (fs->fs_avgfpdir <= 0)		/* XXX */
+		fs->fs_avgfpdir = AFPDIR;	/* XXX */
 
 	/*
 	 * Step 3: re-read summary information from disk.


Re: kernel core

2001-04-23 Thread Ian Dowse


In message <[EMAIL PROTECTED]>, John Baldwin writes:
>
>
>Fair enough, I guess ffs_reload() should just sanity check the values.  Any
>takers?

You could try this (untested). I have to run now, but I can test it
later as it's easy enough to reproduce.

Ian

Index: ffs_vfsops.c
===
RCS file: /dump/FreeBSD-CVS/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.146
diff -u -r1.146 ffs_vfsops.c
--- ffs_vfsops.c   2001/04/17 05:37:51 1.146
+++ ffs_vfsops.c   2001/04/23 22:15:55
@@ -427,6 +427,11 @@
 	brelse(bp);
 	mp->mnt_maxsymlinklen = fs->fs_maxsymlinklen;
 	ffs_oldfscompat(fs);
+	/* An old fsck may have clobbered these fields, so recheck them. */
+	if (fs->fs_avgfilesize <= 0)		/* XXX */
+		fs->fs_avgfilesize = AVFILESIZ;	/* XXX */
+	if (fs->fs_avgfpdir <= 0)		/* XXX */
+		fs->fs_avgfpdir = AFPDIR;	/* XXX */
 
 	/*
 	 * Step 3: re-read summary information from disk.


Re: kernel core

2001-04-23 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Warner Losh writes:
>: 
>: Yes, but until such time as we do that we should warn people in UPDATING at
>: least.
>: 
>
>OK, but you won't like the UPDATING entry.

The bug actually looks fairly simple to fix. ffs_reload() isn't
checking if the new superblock fields are zero, so if an old fsck
zeros them out between a read-only mount and a read-write remount,
then we get a division by zero or something later.
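
A cut-down illustration of the hazard, assuming the two dirpref fields
are multiplied together and later used as a divisor. The struct
mini_fs, the function names and the numeric defaults (16384 and 64) are
assumptions made for this sketch, not copied from the kernel sources.

```c
/* Assumed defaults for the sketch. */
#define AVFILESIZ	16384	/* expected average file size */
#define AFPDIR		64	/* expected files per directory */

/* Hypothetical cut-down superblock with only the two dirpref tunables. */
struct mini_fs {
	int fs_avgfilesize;
	int fs_avgfpdir;
};

/* Restore defaults if an old fsck left the fields zeroed. */
static void
sanitise(struct mini_fs *fs)
{
	if (fs->fs_avgfilesize <= 0)
		fs->fs_avgfilesize = AVFILESIZ;
	if (fs->fs_avgfpdir <= 0)
		fs->fs_avgfpdir = AFPDIR;
}

/* Simplified stand-in for a computation that divides by the product. */
static int
max_contig_dirs(const struct mini_fs *fs, int cgsize)
{
	int dirsize = fs->fs_avgfilesize * fs->fs_avgfpdir;
	int n = cgsize / dirsize;	/* traps if dirsize is zero */

	return (n < 255 ? n : 255);
}
```

Calling sanitise() before any such computation keeps the divisor
non-zero even when the on-disk values were wiped.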

Ian


fsdb broken in -current

2001-04-23 Thread Ian Dowse



The last set of changes to fsck_ffs moved the initialisation of
dev_bsize to sblock_init(), but this is not called by fsdb(8) so
fsdb dies almost immediately with a floating exception. I'm just
going to commit the obvious fix, which is to have fsdb call
sblock_init() also.

Ian


reboot(8) delay between SIGTERM and SIGKILL

2001-03-19 Thread Ian Dowse



I have noticed that reboot(8) sometimes appears not to wait long
enough before sending the final SIGKILL to all processes. On a
system that has a lot of processes swapped out, some processes such
as the X server may get a SIGKILL before they have had a chance to
perform their exit cleanup.

The patch below causes reboot to wait up to 60 seconds for paging
activity to end before sending the SIGKILLs. It does this by
monitoring the sysctl `vm.stats.vm.v_swappgsin', and extending
the default 5-second delay if page-in operations are observed.

On my laptop (64MB, IDE disk) with a number of big apps running,
it can take around 20 seconds for all the paging to die down after
the SIGTERMs are sent.

I know the choice of sysctl to monitor is slightly arbitrary, but
it seems to have the right overall effect. Does anyone have any
objections to my committing this?
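
The waiting logic on its own, as a sketch that can be run without a
real sysctl or real sleeping. The callback types and the fake_* helpers
are invented for this example; in the patch the counter comes from
sysctlbyname() and the delay from sleep(3).

```c
/* Callbacks so the loop can be exercised without a real sysctl or sleep. */
typedef unsigned int (*counter_fn)(void);
typedef void (*sleep_fn)(unsigned int seconds);

/*
 * Poll a page-in counter every `step' seconds, at most max_polls times,
 * stopping after the first interval in which it did not change.
 * Returns the number of intervals waited.
 */
static int
wait_for_quiet(counter_fn get_count, sleep_fn do_sleep, unsigned int step,
    int max_polls)
{
	int waited = 0;

	for (int i = 0; i < max_polls; i++) {
		unsigned int before = get_count();

		do_sleep(step);
		waited++;
		if (get_count() == before)
			break;
	}
	return (waited);
}

/* Test doubles: a counter that rises for a while and then goes flat. */
static unsigned int fake_reads, fake_busy_reads, fake_slept;

static unsigned int
fake_counter(void)
{
	fake_reads++;
	return (fake_reads < fake_busy_reads ? fake_reads : fake_busy_reads);
}

static void
fake_sleep(unsigned int seconds)
{
	fake_slept += seconds;	/* record the time instead of sleeping */
}
```

With step = 3 and max_polls = 20 the worst case is 60 seconds, matching
the limit described above; an idle system leaves after one 3-second
interval.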

Ian

Index: reboot.c
===
RCS file: /dump/FreeBSD-CVS/src/sbin/reboot/reboot.c,v
retrieving revision 1.9
diff -u -r1.9 reboot.c
--- reboot.c   1999/11/21 21:52:40 1.9
+++ reboot.c   2001/03/19 17:01:37
@@ -47,6 +47,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -58,6 +59,7 @@
 #include 
 
 void usage __P((void));
+u_int get_pageins __P((void));
 
 int dohalt;
 
@@ -152,13 +154,22 @@
 	/*
 	 * After the processes receive the signal, start the rest of the
 	 * buffers on their way.  Wait 5 seconds between the SIGTERM and
-	 * the SIGKILL to give everybody a chance.
+	 * the SIGKILL to give everybody a chance. If there is a lot of
+	 * paging activity then wait longer, up to a maximum of approx
+	 * 60 seconds.
 	 */
 	sleep(2);
 	if (!nflag)
 		sync();
-	sleep(3);
+	for (i = 0; i < 20; i++) {
+		u_int old_pageins;
 
+		old_pageins = get_pageins();
+		sleep(3);
+		if (get_pageins() == old_pageins)
+			break;
+	}
+
 	for (i = 1;; ++i) {
 		if (kill(-1, SIGKILL) == -1) {
 			if (errno == ESRCH)
@@ -189,4 +200,19 @@
 	(void)fprintf(stderr, "usage: %s [-dnpq]\n",
 	    dohalt ? "halt" : "reboot");
 	exit(1);
+}
+
+u_int
+get_pageins()
+{
+	u_int pageins;
+	size_t len;
+
+	len = sizeof(pageins);
+	if (sysctlbyname("vm.stats.vm.v_swappgsin", &pageins, &len, NULL, 0)
+	    != 0) {
+		warnx("v_swappgsin");
+		return (0);
+	}
+	return pageins;
 }


Re: growfs

2001-03-18 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Andrea Campi writes:
>
>Anyway, that was not my point. If I reboot into single-user, and am thus sure
>to have the / fs in a clean, consistent state, should I expect growfs to work
>in a safe way? If so, we should document it.

I think it is still unlikely to be completely safe. The kernel may
panic if it finds inconsistencies in the filesystem, and I'm sure
that growfs (temporarily) introduces some very serious inconsistencies
while it is running. Also, when growfs completes, the kernel's idea
of the filesystem is quite different from the parameters actually
set on the disk.

If the kernel was to panic half-way through a growfs operation, or
if growfs died, say because the kernel failed to fault in some
pages from the growfs executable, you could end up with a very
confused filesystem!

Ian


Re: Fw: Stop annoying message of lnc

2001-03-18 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Mike Smith writes:
>
>I don't quite understand Paul's reasoning, though; it's not actually 
>useful to unload/reload parts of a device's bus attachment without 
>unloading/reloading all the downstream parts of the driver.
>
>I think the fix should probably be committed and the driver turned into a 
>single monolithic module.

Yes, Paul essentially agreed to my doing this as an interim measure
until ifconfig is "fixed" to use the module file name rather than
the module name when loading drivers. I'll commit the change in a
few hours after I have tested that it works.

Ian


Re: Kernel Panic from Yesterday's CVSup

2001-02-07 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, John Baldwin writes:

>> malloc(48,c0238100,0,c65feb80,0) at malloc+0x2a
>> exit1(c65feb80,0,0,c6623f78,c01fc852) at exit1+0x1b1
>> kthread_suspend(0,c0279a40,0,c022d1ec,a2) at kthread_suspend
>> ithd_loop(0,c6623fa8) at ithd_loop+0x56
>> fork_exit(c01fc7fc,0,c6623fa8) at fork_exit+0x8
>> fork_trampoline() at fork_trampoline+0x8
>> db> witness_list
>> "Giant" (0xc0279a40) locked at ../../i386/isa/ithread.c:162
>
>Erm, ithd_loop() doesn't call kthread_suspend().  *sigh*.  Something
>else is rather messed up here I'm afraid.

Note that the return address into kthread_suspend is kthread_suspend+0x0.
Since the call to exit1() in kthread_exit is the very last operation
in kthread_exit, you'd expect the return address on the stack to be
at the start of the next function...
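
A toy version of the symbol lookup that produces such a trace. The
addresses and the three-entry table are hypothetical; only the ordering
(kthread_exit immediately followed by kthread_suspend) is taken from
the discussion above.

```c
#include <stddef.h>

struct sym {
	unsigned long addr;	/* start address of the function */
	const char *name;
};

/* Hypothetical layout, sorted by address; only the ordering matters. */
static const struct sym symtab[] = {
	{ 0x1000, "kthread_exit" },
	{ 0x1040, "kthread_suspend" },
	{ 0x10c0, "kthread_resume" },
};

/*
 * Resolve pc the way a debugger labels a return address: pick the last
 * symbol that starts at or below pc and report the offset into it.
 * Returns NULL if pc lies below the first symbol.
 */
static const char *
resolve(unsigned long pc, unsigned long *offset)
{
	const char *name = NULL;
	size_t n = sizeof(symtab) / sizeof(symtab[0]);

	for (size_t i = 0; i < n && symtab[i].addr <= pc; i++) {
		name = symtab[i].name;
		*offset = pc - symtab[i].addr;
	}
	return (name);
}
```

If the call is the final instruction of kthread_exit, the saved return
address equals the start of the following function, so the lookup
prints kthread_suspend with offset 0 even though that function never
ran.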

Ian


Re: Can't make installworld :(

2000-06-09 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Hostas Red writes:
>I can't make installworld for some time with following message:
>
>vm/vm_object.h -> vm/vm_object.ph
>vm/vm_page.h -> vm/vm_page.ph
>vm/vm_pageout.h -> vm/vm_pageout.ph
>vm/vm_pager.h -> vm/vm_pager.ph
>vm/vm_param.h -> vm/vm_param.ph
>vm/vm_prot.h -> vm/vm_prot.ph
>vm/vm_zone.h -> vm/vm_zone.ph
>vm/vnode_pager.h -> vm/vnode_pager.ph
>*** Error code 1

I've seen this before. h2ph will return a non-zero exit status if it
failed to open _any_ of the files listed on the command line. This
will typically happen if you have a dangling symbolic link somewhere
in /usr/include. The error message indicating exactly which files
h2ph couldn't open will be somewhere among all the 'XX.h -> XX.ph'
messages.
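
One way to check a path for this condition, as a sketch assuming a
POSIX system with a writable /tmp. The helper names and the file names
stale.h and gone.h are made up for the demonstration.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Return 1 if path is a symbolic link whose target cannot be stat'ed. */
static int
is_dangling_symlink(const char *path)
{
	struct stat lsb, sb;

	if (lstat(path, &lsb) != 0 || !S_ISLNK(lsb.st_mode))
		return (0);
	return (stat(path, &sb) != 0);
}

/* Build a scratch directory holding one dangling link; 1 if detected. */
static int
dangling_demo(void)
{
	char dir[] = "/tmp/h2phdemoXXXXXX";
	char lnk[64], missing[64];
	int ok;

	if (mkdtemp(dir) == NULL)
		return (0);
	snprintf(lnk, sizeof(lnk), "%s/stale.h", dir);
	snprintf(missing, sizeof(missing), "%s/gone.h", dir);
	if (symlink(missing, lnk) != 0) {
		rmdir(dir);
		return (0);
	}
	ok = is_dangling_symlink(lnk) && !is_dangling_symlink(dir);
	unlink(lnk);
	rmdir(dir);
	return (ok);
}
```

Running the same test over every entry under /usr/include would point
at the links that make h2ph fail.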

Ian


Re: Best NIC for FBSD (was: Buffer Problems and hangs in 4.0-CURRENT..)

2000-03-15 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Mike Smith writes:
>> fxp0:  The Intel driver is by far the highest performance model,
>> beats the 3com (second best) hands down with much lower CPU 
>> overhead.
>
>Do you actually have any numbers to quantify this?  There's nothing in 
>the driver architecture nor any of my testing that would suggest this is 
>actually the case at this point.

The FreeBSD fxp driver does a lot to reduce the number of transmit
interrupts; only 1/120 of transmitted packets result in interrupts. See
the code relating to FXP_CXINT_THRESH.

Assuming an even balance of transmitted and received packets, this should
reduce the total number of interrupts by nearly 50%. I don't know if
drivers for other cards do (or even can) use this approach.
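
The "nearly 50%" figure follows from simple arithmetic. This sketch
assumes, as a baseline, one interrupt per packet in each direction and
one interrupt per received packet in the coalesced case; the function
names are invented for the example.

```c
/*
 * Interrupts taken for n packets in each direction when every received
 * packet interrupts but only one in tx_ratio transmitted packets does.
 */
static unsigned long
interrupts_coalesced(unsigned long n, unsigned long tx_ratio)
{
	return (n + n / tx_ratio);
}

/* Percentage saved relative to one interrupt per packet both ways. */
static double
percent_saved(unsigned long n, unsigned long tx_ratio)
{
	double baseline = 2.0 * (double)n;

	return (100.0 * (baseline -
	    (double)interrupts_coalesced(n, tx_ratio)) / baseline);
}
```

For 120000 packets each way and a ratio of 120, that is 121000
interrupts instead of 240000, a saving of a little under 50%.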

Ian
