Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Otto Moerbeek
On Fri, Apr 01, 2011 at 06:54:37AM +0200, Otto Moerbeek wrote:

> > A request to tech@,
> > Can you please consider architecture wise bumps in filesystem defaults
> > to data block size and fragment size such that inodes to be scanned
> > are reduced? When you run fsck you really want it to be correct and if
> > possible faster? I don't know when this fsck diff will get integrated
> > (it is early in cycle!!! wink wink) but defaults can get bumped after
> > discussion.
> > 
> > Thanks in advance
> 
> I think the defaults are reasonable. 

Some data points to stress that.

Filesystem     Size    Used   Avail Capacity   iused   ifree  %iused  Mounted on
/dev/sd0i      1.3G    772M    479M    62%     88579   93307    49%   /usr/src
/dev/sd1e      133G    2.6G    124G     2%    265879 8559975     3%   /scratch

A small (16k block) filesystem and a largish (32k block) filesystem,
both created with defaults. The /scratch filesystem has a few src tree
copies on it.

You can see that if I created fewer inodes, in both cases I would run
the risk of running out of inodes before running out of space. That
would suck.

In your particular case it might work, but the defaults assume that
running out of inodes is a really bad thing you want to avoid.

Also, moving to larger blocks wastes more space, so that is also a
reason to be conservative.
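
To make the trade-off concrete, here is a small stand-alone C sketch
(an illustration only, not newfs source; it assumes the usual 8
fragments per block and the 4-times-fragment-size default for -i that
comes up later in the thread) of roughly how many inodes the two
filesystems above end up with:

#include <stdio.h>

static void
show(const char *name, long long bytes, int fragsize)
{
	int density = 4 * fragsize;	/* assumed default -i: 4 * fragment size */

	printf("%-10s frag %5d  bytes-per-inode %6d  ~%lld inodes\n",
	    name, fragsize, density, bytes / density);
}

int
main(void)
{
	show("/usr/src", 1300LL << 20, 2048);	/* 16k-block fs, 2k fragments */
	show("/scratch", 133LL << 30, 4096);	/* 32k-block fs, 4k fragments */
	return 0;
}

With the defaults this gives on the order of 170k inodes for /usr/src
and 8.7M for /scratch, in the same ballpark as the iused+ifree totals
above; doubling the bytes-per-inode value would halve those counts,
and with them the headroom before the filesystem runs out of inodes.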

-Otto



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Otto Moerbeek
On Thu, Mar 31, 2011 at 10:11:37PM -0500, Amit Kulkarni wrote:

> Hi Otto,
> 
> The speedup is very considerable! Thanks a ton for this diff! I didn't
> find any problems running it on all my fs. I cannot get fsck -p, maybe
> you have to run in single user mode? But still normal fsck works on
> all partitions and with much more speed than before. I changed all my
> large partitions to have -b 65536 -f 8192.

Please tell us more about why -p does not work. What happens when you
try it? Be more exact in your description.

So why is your system extra slow? Maybe it has too little memory and
starts swapping. Some details might come in handy.

> 
> A request to tech@,
> Can you please consider architecture wise bumps in filesystem defaults
> to data block size and fragment size such that inodes to be scanned
> are reduced? When you run fsck you really want it to be correct and if
> possible faster? I don't know when this fsck diff will get integrated
> (it is early in cycle!!! wink wink) but defaults can get bumped after
> discussion.
> 
> Thanks in advance

I think the defaults are reasonable. 

-Otto



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Amit Kulkarni
Hi Otto,

The speedup is very considerable! Thanks a ton for this diff! I didn't
find any problems running it on all my fs. I cannot get fsck -p, maybe
you have to run in single user mode? But still normal fsck works on
all partitions and with much more speed than before. I changed all my
large partitions to have -b 65536 -f 8192.

A request to tech@,
Can you please consider architecture wise bumps in filesystem defaults
to data block size and fragment size such that inodes to be scanned
are reduced? When you run fsck you really want it to be correct and if
possible faster? I don't know when this fsck diff will get integrated
(it is early in cycle!!! wink wink) but defaults can get bumped after
discussion.

Thanks in advance

On Thu, Mar 31, 2011 at 3:14 PM, Otto Moerbeek  wrote:
>
> So here's an initial, only lightly tested diff.
>
> Beware, this very well could eat your filesystems.
>
> To note any difference, you should use the -p mode of fsck_ffs (rc
> does that) and the fs should have been mounted with softdep.
>
> I have seen very nice speedups already.
>
>-Otto



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Amit Kulkarni
> I don't think we want to change the default density. Larger
> partitions already get larger blocks and fragments, and as a
> consequence a lower number of inodes.
>

>> Otto,
>> In my tests on AMD64, if FFS partition size increases beyond 30GB,
>> fsck starts taking exponential time even if you have zero used inodes.
>> This is a for i () for j() loop and if you reduce the for j() inner
>> loop it is a win.
>
> Yes, it becomes very slow, but I don't think it is exponential.

Wow, even with the ***existing code***, because I did a newfs -b 65536
-f 8192 wd0m (this has an implicit -i 32768):

fsck chewed through an 80G partition with 2 clang static analyzer runs
(2100 files of 200 Kb each) within 1 minute. Before this, it never got
past pass1 in over 5 hours.

Insanely fast fsck runs. Thanks Stuart and Otto. Why don't you make
this the newfs default? What does everybody say?
newfs -b 65536 -f 8192 -i 32768

Somebody ought to change the section in the FAQ too!

I will try out your diff right now.

>>
>> dumpfs -m /downloads
>> # newfs command for /dev/wd0o
>> newfs -O 1 -b 16384 -e 4096 -f 2048 -g 16384 -h 64 -m 5 -o time -s
>> 172714816 /dev/wd0o
>>
>> So, if I read it correctly, setting just the block size higher to say
>> 64Kb does auto tune frag size to 1/8 which is 8Kb (newfs complains
>> appropriately) but the auto tune inode length to 4 times frag which is
>> 32Kb is not implemented now? Is this the proposed formula?
>
> There's no such thing as inode length.
>

Sorry, what I meant was: the size required to consider storing a single inode?



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Amit Kulkarni
>> If you really have a lot of used inodes, skipping the unused ones
>> isn't going to help :-)
>>
>> You could always build your large-sized filesystems with a larger
>> value of bytes-per-inode. newfs -i 32768 or 65536 is good for common
>> filesystem use patterns with larger partitions (for specialist uses
>> e.g. storing backups as huge single files it might be appropriate
>> to go even higher).
>

Stuart,

Thanks for the tip. But I can verify that when I looked up my 80G
filesystem it is currently not specifying -i, so it is 8Kb per
single inode (4 times the frag size, per your update to the newfs man
page). This is a no-brainer optimization which can get huge wins in
fsck immediately without much change to the existing code.

Otto,
In my tests on AMD64, if FFS partition size increases beyond 30GB,
fsck starts taking exponential time even if you have zero used inodes.
This is a for i () for j() loop and if you reduce the for j() inner
loop it is a win.

dumpfs -m /downloads
# newfs command for /dev/wd0o
newfs -O 1 -b 16384 -e 4096 -f 2048 -g 16384 -h 64 -m 5 -o time -s
172714816 /dev/wd0o

So, if I read it correctly, setting just the block size higher to say
64Kb does auto tune frag size to 1/8 which is 8Kb (newfs complains
appropriately) but the auto tune inode length to 4 times frag which is
32Kb is not implemented now? Is this the proposed formula?

If a user tunes -i inodes, or -f frags or -b block size, it should all
auto-adjust to the same outcome based on above formula in the future?

dumpfs doesn't show the total inodes or the inode length in an easily
readable format (-m option). Just trying to understand what the
acronyms mean.

Thanks

> disklabel has code already to move to larger block and frag sizes for
> large (new) partitions. newfs picks these settings up.
>
>
>>
>> Of course this does involve dump/restore if you need to do this for
>> an existing filesystem.
>>
>> > It is interesting because it really speeds up fsck_ffs for filesystems
>> > with few used inodes.
>> >
>> > There's also a dangerous part: it assumes the cylinder group summary
>> > info is ok when softdeps has been used.
>> >
>> > I suppose that's the reason why it was never included into OpenBSD.
>> >
>> > I'll ponder if I want to work on this.
>>

>> A safer alternative to this optimization might be for the installer
>> (or newfs) to consider the fs size when deciding on a default inode
>> density.
>
>-Otto



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Otto Moerbeek
On Thu, Mar 31, 2011 at 10:12:07PM +0200, Otto Moerbeek wrote:


> > So, if I read it correctly, setting just the block size higher to say
> > 64Kb does auto tune frag size to 1/8 which is 8Kb (newfs complains
> > appropriately) but the auto tune inode length to 4 times frag which is
> > 32Kb is not implemented now? Is this the proposed formula?
> 
> There's no such thing as inode length. 
> 
> > 
> > If a user tunes -i inodes, or -f frags or -b block size, it should all
> > auto-adjust to the same outcome based on above formula in the future?
> 
> I don't see any formula.

Ah, now I understand what you mean by formula.

The rule is: if no -i parameter is given, its value is computed as
4 * fragment size.

Default values for -b and -f are taken from the disklabel.
disklabel(8) in -E mode fills them in based on the partition size. If
you specify -f or -b with newfs, these values override the values in
the label, and the label will be updated after the newfs. So the next
time you do a newfs, you'll re-use the last values for -b and -f.

-Otto



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Otto Moerbeek
On Thu, Mar 31, 2011 at 10:14:46PM +0200, Otto Moerbeek wrote:

> So here's an initial, only lightly tested diff.
> 
> Beware, this very well could eat your filesystems.
> 
> To note any difference, you should use the -p mode of fsck_ffs (rc
> does that) and the fs should have been mounted with softdep.
> 
> I have seen very nice speedups already.

But don't count yourself a rich man too soon: for ffs2 filesystems,
you won't see a lot of speedup, because inode blocks are allocated
on demand there, so a filesystem with few inodes used likely has few
inode blocks.

Also, depending on the usage patterns, you might have a fs where high
numbered inodes are used while the fs itself is pretty empty. Filling
up a fs with lots of files and then removing a lot of them is an
example that could lead to such a situation. This diff does not speed
things up in such cases.
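
For context, the shortcut being discussed works roughly like the
following simplified sketch (modeled on FreeBSD's fsck_ffs pass1.c,
not the posted diff itself; it trusts the per-cylinder-group inode
bitmap, which is exactly the softdep assumption mentioned elsewhere
in this thread): scan the bitmap backwards for the highest allocated
inode and only examine inodes up to that point. A group whose highest
inode numbers are in use therefore gains nothing.

#include <sys/param.h>
#include <ufs/ffs/fs.h>
#include <limits.h>

/*
 * Return how many inodes of this cylinder group pass 1 really needs
 * to look at: everything above the highest in-use inode is skipped.
 */
static int
inodes_to_scan(struct cg *cgp, int ipg)
{
	u_char *map = cg_inosused(cgp);	/* in-use bitmap of this cg */
	int inosused = ipg;		/* worst case: scan them all */
	u_char *cp;
	int i;

	for (cp = &map[(inosused - 1) / CHAR_BIT]; inosused > 0; cp--) {
		if (*cp == 0) {		/* whole byte unused: skip 8 inodes */
			inosused -= CHAR_BIT;
			continue;
		}
		for (i = 1 << (CHAR_BIT - 1); i > 0; i >>= 1) {
			if (*cp & i)	/* highest used inode in this byte */
				break;
			inosused--;
		}
		break;
	}
	return (inosused < 0 ? 0 : inosused);
}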

-Otto



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Otto Moerbeek
So here's an initial, only lightly tested diff.

Beware, this very well could eat your filesystems.

To note any difference, you should use the -p mode of fsck_ffs (rc
does that) and the fs should have been mounted with softdep.

I have seen very nice speedups already.

-Otto

Index: dir.c
===
RCS file: /cvs/src/sbin/fsck_ffs/dir.c,v
retrieving revision 1.24
diff -u -p -r1.24 dir.c
--- dir.c   27 Oct 2009 23:59:32 -  1.24
+++ dir.c   31 Mar 2011 08:30:36 -
@@ -443,8 +443,8 @@ linkup(ino_t orphan, ino_t parentdir)
idesc.id_type = ADDR;
idesc.id_func = pass4check;
idesc.id_number = oldlfdir;
-   adjust(&idesc, lncntp[oldlfdir] + 1);
-   lncntp[oldlfdir] = 0;
+   adjust(&idesc, ILNCOUNT(oldlfdir) + 1);
+   ILNCOUNT(oldlfdir) = 0;
dp = ginode(lfdir);
}
if (GET_ISTATE(lfdir) != DFOUND) {
@@ -457,7 +457,7 @@ linkup(ino_t orphan, ino_t parentdir)
printf("\n\n");
return (0);
}
-   lncntp[orphan]--;
+   ILNCOUNT(orphan)--;
if (lostdir) {
if ((changeino(orphan, "..", lfdir) & ALTERED) == 0 &&
parentdir != (ino_t)-1)
@@ -465,7 +465,7 @@ linkup(ino_t orphan, ino_t parentdir)
dp = ginode(lfdir);
DIP_SET(dp, di_nlink, DIP(dp, di_nlink) + 1);
inodirty();
-   lncntp[lfdir]++;
+   ILNCOUNT(lfdir)++;
pwarn("DIR I=%u CONNECTED. ", orphan);
if (parentdir != (ino_t)-1) {
printf("PARENT WAS I=%u\n", parentdir);
@@ -476,7 +476,7 @@ linkup(ino_t orphan, ino_t parentdir)
 * fixes the parent link count so that fsck does
 * not need to be rerun.
 */
-   lncntp[parentdir]++;
+   ILNCOUNT(parentdir)++;
}
if (preen == 0)
printf("\n");
@@ -636,7 +636,7 @@ allocdir(ino_t parent, ino_t request, in
DIP_SET(dp, di_nlink, 2);
inodirty();
if (ino == ROOTINO) {
-   lncntp[ino] = DIP(dp, di_nlink);
+   ILNCOUNT(ino) = DIP(dp, di_nlink);
cacheino(dp, ino);
return(ino);
}
@@ -650,8 +650,8 @@ allocdir(ino_t parent, ino_t request, in
inp->i_dotdot = parent;
SET_ISTATE(ino, GET_ISTATE(parent));
if (GET_ISTATE(ino) == DSTATE) {
-   lncntp[ino] = DIP(dp, di_nlink);
-   lncntp[parent]++;
+   ILNCOUNT(ino) = DIP(dp, di_nlink);
+   ILNCOUNT(parent)++;
}
dp = ginode(parent);
DIP_SET(dp, di_nlink, DIP(dp, di_nlink) + 1);
Index: extern.h
===
RCS file: /cvs/src/sbin/fsck_ffs/extern.h,v
retrieving revision 1.10
diff -u -p -r1.10 extern.h
--- extern.h25 Jun 2007 19:59:55 -  1.10
+++ extern.h31 Mar 2011 11:56:53 -
@@ -54,6 +54,7 @@ int   ftypeok(union dinode *);
 void   getpathname(char *, size_t, ino_t, ino_t);
 void   inocleanup(void);
 void   inodirty(void);
+struct inostat *inoinfo(ino_t);
 intlinkup(ino_t, ino_t);
 intmakeentry(ino_t, ino_t, char *);
 void   pass1(void);
Index: fsck.h
===
RCS file: /cvs/src/sbin/fsck_ffs/fsck.h,v
retrieving revision 1.23
diff -u -p -r1.23 fsck.h
--- fsck.h  10 Jun 2008 23:10:29 -  1.23
+++ fsck.h  31 Mar 2011 11:55:42 -
@@ -66,6 +66,19 @@ union dinode {
 #define BUFSIZ 1024
 #endif
 
+/*
+ * Each inode on the file system is described by the following structure.
+ * The linkcnt is initially set to the value in the inode. Each time it
+ * is found during the descent in passes 2, 3, and 4 the count is
+ * decremented. Any inodes whose count is non-zero after pass 4 needs to
+ * have its link count adjusted by the value remaining in ino_linkcnt.
+ */
+struct inostat {
+   charino_state;  /* state of inode, see below */
+   charino_type;   /* type of inode */
+   short   ino_linkcnt;/* number of links not found */
+};
+
 #defineUSTATE  01  /* inode not allocated */
 #defineFSTATE  02  /* inode is file */
 #defineDSTATE  03  /* inode is directory */
@@ -73,12 +86,20 @@ union dinode {
 #defineDCLEAR  05  /* directory is to be cleared */
 #defineFCLEAR  06  /* file is to be cleared */
 
-#define GET_ISTATE(ino)(stmap[(ino)] & 0xf)
-#define GET_ITYPE(ino) (stmap[(ino)] >> 4)
-#define SET_ISTATE(ino, v) do { stmap[(ino)] = (stmap[(ino)] & 0xf0) | \
-   ((v) & 0xf); } while (0)
-#define SET_ITYPE

Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Otto Moerbeek
On Thu, Mar 31, 2011 at 02:50:36PM -0500, Amit Kulkarni wrote:

> >>
> >> If you really have a lot of used inodes, skipping the unused ones
> >> isn't going to help :-)
> >>
> >> You could always build your large-sized filesystems with a larger
> >> value of bytes-per-inode. newfs -i 32768 or 65536 is good for common
> >> filesystem use patterns with larger partitions (for specialist uses
> >> e.g. storing backups as huge single files it might be appropriate
> >> to go even higher).
> >
> 
> Stuart,
> 
> Thanks for the tip. But I can verify when I did lookup my 80G
> filesystem it is currently not specifying -i, so it is 8Kb per a
> single inode (it is 4 times frag size per your update to newfs man
> page). This is a no brainer optimization which can get huge wins in
> fsck immediately without too much change in the existing code.

I don't think we want to change the default density. Larger
partitions already get larger blocks and fragments, and as a
consequence a lower number of inodes.

> Otto,
> In my tests on AMD64, if FFS partition size increases beyond 30GB,
> fsck starts taking exponential time even if you have zero used inodes.
> This is a for i () for j() loop and if you reduce the for j() inner
> loop it is a win.

Yes, it becomes very slow, but I don't think it is exponential.

> 
> dumpfs -m /downloads
> # newfs command for /dev/wd0o
> newfs -O 1 -b 16384 -e 4096 -f 2048 -g 16384 -h 64 -m 5 -o time -s
> 172714816 /dev/wd0o
> 
> So, if I read it correctly, setting just the block size higher to say
> 64Kb does auto tune frag size to 1/8 which is 8Kb (newfs complains
> appropriately) but the auto tune inode length to 4 times frag which is
> 32Kb is not implemented now? Is this the proposed formula?

There's no such thing as inode length. 

> 
> If a user tunes -i inodes, or -f frags or -b block size, it should all
> auto-adjust to the same outcome based on above formula in the future?

I don't see any formula.

If you feel you have too many inodes, you can use a larger -i, -b and/or -f.
For newly created partitions, newfs will pick up larger -b and -f from
the disklabel entry. If you still want fewer inodes, increase -f, -b or
-i further.

> 
> dumpfs doesn't show the total inodes or the inode length in a easily
> readable format (-m option). Just trying to understand what the
> acronyms mean.

You want total inodes = ncg * ipg (number of cylinder groups * inodes per
group) in the dumpfs header. I have no idea what you mean by inode length.

-Otto



wol for xl(4)

2011-03-31 Thread Stefan Sperling
This is an attempt to add wol support to xl(4).

Unfortunately, while I have an xl(4) card to test with, none of the
motherboards I have will do WOL with it since they all lack an
on-board WOL connector :(

So test reports are needed.
Please also check whether WOL is disabled by default.

Index: ic/xl.c
===
RCS file: /cvs/src/sys/dev/ic/xl.c,v
retrieving revision 1.99
diff -u -p -r1.99 xl.c
--- ic/xl.c 22 Sep 2010 08:49:14 -  1.99
+++ ic/xl.c 31 Mar 2011 15:48:36 -
@@ -191,6 +191,9 @@ void xl_testpacket(struct xl_softc *);
 int xl_miibus_readreg(struct device *, int, int);
 void xl_miibus_writereg(struct device *, int, int, int);
 void xl_miibus_statchg(struct device *);
+#ifndef SMALL_KERNEL
+int xl_wol(struct ifnet *, int);
+#endif
 
 int
 xl_activate(struct device *self, int act)
@@ -2368,6 +2371,12 @@ xl_stop(struct xl_softc *sc)
ifp->if_flags &= ~(IFF_RUNNING | IFF_OACTIVE);
 
xl_freetxrx(sc);
+
+#ifndef SMALL_KERNEL
+   /* Call upper layer WOL power routine if WOL is enabled. */
+   if ((sc->xl_flags & XL_FLAG_WOL) && sc->wol_power)
+   sc->wol_power(sc->wol_power_arg);
+#endif
 }
 
 void
@@ -2637,6 +2646,15 @@ xl_attach(struct xl_softc *sc)
CSR_WRITE_2(sc, XL_W0_MFG_ID, XL_NO_XCVR_PWR_MAGICBITS);
}
 
+#ifndef SMALL_KERNEL
+   /* Check availability of WOL. */
+   if ((sc->xl_caps & XL_CAPS_PWRMGMT) != 0) {
+   ifp->if_capabilities |= IFCAP_WOL;
+   ifp->if_wol = xl_wol;
+   xl_wol(ifp, 0);
+   }
+#endif
+
/*
 * Call MI attach routines.
 */
@@ -2668,6 +2686,24 @@ xl_detach(struct xl_softc *sc)
 
return (0);
 }
+
+#ifndef SMALL_KERNEL
+int
+xl_wol(struct ifnet *ifp, int enable)
+{
+   struct xl_softc *sc = ifp->if_softc;
+
+   XL_SEL_WIN(7);
+   if (enable) {
+   CSR_WRITE_2(sc, XL_W7_BM_PME, XL_BM_PME_MAGIC);
+   sc->xl_flags |= XL_FLAG_WOL;
+   } else {
+   CSR_WRITE_2(sc, XL_W7_BM_PME, 0);
+   sc->xl_flags &= ~XL_FLAG_WOL;
+   }
+   return (0); 
+}
+#endif
 
 struct cfdriver xl_cd = {
0, "xl", DV_IFNET
Index: ic/xlreg.h
===
RCS file: /cvs/src/sys/dev/ic/xlreg.h,v
retrieving revision 1.26
diff -u -p -r1.26 xlreg.h
--- ic/xlreg.h  21 Sep 2010 01:05:12 -  1.26
+++ ic/xlreg.h  31 Mar 2011 15:42:36 -
@@ -411,6 +411,12 @@
 #define XL_W7_BM_LEN   0x06
 #define XL_W7_BM_STATUS0x0B
 #define XL_W7_BM_TIMEr 0x0A
+#define XL_W7_BM_PME   0x0C
+
+#defineXL_BM_PME_WAKE  0x0001
+#defineXL_BM_PME_MAGIC 0x0002
+#defineXL_BM_PME_LINKCHG   0x0004
+#defineXL_BM_PME_WAKETIMER 0x0008
 
 /*
  * bus master control registers
@@ -571,6 +577,7 @@ struct xl_mii_frame {
 #define XL_FLAG_NO_XCVR_PWR0x0080
 #define XL_FLAG_USE_MMIO   0x0100
 #define XL_FLAG_NO_MMIO0x0200
+#define XL_FLAG_WOL0x0400
 
 #define XL_NO_XCVR_PWR_MAGICBITS   0x0900
 
@@ -604,6 +611,8 @@ struct xl_softc {
caddr_t sc_listkva;
bus_dmamap_tsc_rx_sparemap;
bus_dmamap_tsc_tx_sparemap;
+   void (*wol_power)(void *);
+   void *wol_power_arg;
 };
 
 #define xl_rx_goodframes(x) \
@@ -740,6 +749,13 @@ struct xl_stats {
 #define XL_PSTATE_D3   0x0003
 #define XL_PME_EN  0x0010
 #define XL_PME_STATUS  0x8000
+
+/* Bits in the XL_PCI_PWRMGMTCAP register */
+#define XL_PME_CAP_D0  0x0800
+#define XL_PME_CAP_D1  0x1000
+#define XL_PME_CAP_D2  0x2000
+#define XL_PME_CAP_D3_HOT  0x4000
+#define XL_PME_CAP_D3_COLD 0x8000
 
 extern int xl_intr(void *);
 extern void xl_attach(struct xl_softc *);
Index: pci/if_xl_pci.c
===
RCS file: /cvs/src/sys/dev/pci/if_xl_pci.c,v
retrieving revision 1.34
diff -u -p -r1.34 if_xl_pci.c
--- pci/if_xl_pci.c 19 Sep 2010 09:22:58 -  1.34
+++ pci/if_xl_pci.c 31 Mar 2011 15:43:05 -
@@ -92,10 +92,14 @@ int xl_pci_match(struct device *, void *
 void xl_pci_attach(struct device *, struct device *, void *);
 int xl_pci_detach(struct device *, int);
 void xl_pci_intr_ack(struct xl_softc *);
+#ifndef SMALL_KERNEL
+void xl_pci_wol_power(void *);
+#endif
 
 struct xl_pci_softc {
struct xl_softc psc_softc;
pci_chipset_tag_t   psc_pc;
+   pcitag_tpsc_tag;
bus_size_t  psc_iosize;
bus_size_t  psc_funsize;
 };
@@ -156,9 +160,11 @@ xl_pci_attach(struct device *parent, str
u_int32_t command;
 
psc->psc_pc = pc;
+   psc->psc_tag = pa->pa_tag;
sc->sc_dmat = pa->pa_dmat;
 
sc->xl_fl

Re: additional bpf mtap for carp

2011-03-31 Thread Claudio Jeker
On Thu, Mar 31, 2011 at 03:38:37PM +0200, Mike Belopuhov wrote:
> bpf is not called on multicast/broadcast packets arriving on the carp
> interface.  this allows us to set up drop filters and allows tcpdump to
> show all the packets.
> 
> OK/not-OK?
> 
> Index: ip_carp.c
> ===
> RCS file: /home/cvs/src/sys/netinet/ip_carp.c,v
> retrieving revision 1.181
> diff -u -p -u -p -r1.181 ip_carp.c
> --- ip_carp.c 8 Mar 2011 22:53:28 -   1.181
> +++ ip_carp.c 31 Mar 2011 13:02:43 -
> @@ -1580,6 +1580,11 @@ carp_input(struct mbuf *m, u_int8_t *sho
>   if (m0 == NULL)
>   continue;
>   m0->m_pkthdr.rcvif = &vh->sc_if;
> +#if NBPFILTER > 0
> + if (vh->sc_if.if_bpf)
> + bpf_mtap_hdr(vh->sc_if.if_bpf, (char *)&eh,
> + ETHER_HDR_LEN, m0, BPF_DIRECTION_IN);
> +#endif
The packet accounting is missing as well. So add this:
vh->sc_if.if_ipackets++;
and then the diff is OK. claudio@
>   ether_input(&vh->sc_if, &eh, m0);
>   }
>   return (1);
> 
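
For illustration, the carp_input() block with Claudio's requested
counter folded into the posted hunk would look roughly like this (a
sketch of the combined change, not a tested diff):

#if NBPFILTER > 0
		if (vh->sc_if.if_bpf)
			bpf_mtap_hdr(vh->sc_if.if_bpf, (char *)&eh,
			    ETHER_HDR_LEN, m0, BPF_DIRECTION_IN);
#endif
		/* account for the packet on the carp interface */
		vh->sc_if.if_ipackets++;
		ether_input(&vh->sc_if, &eh, m0);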

-- 
:wq Claudio



additional bpf mtap for carp

2011-03-31 Thread Mike Belopuhov
bpf is not called on multicast/broadcast packets arriving on the carp
interface.  this allows us to set up drop filters and allows tcpdump to
show all the packets.

OK/not-OK?

Index: ip_carp.c
===
RCS file: /home/cvs/src/sys/netinet/ip_carp.c,v
retrieving revision 1.181
diff -u -p -u -p -r1.181 ip_carp.c
--- ip_carp.c   8 Mar 2011 22:53:28 -   1.181
+++ ip_carp.c   31 Mar 2011 13:02:43 -
@@ -1580,6 +1580,11 @@ carp_input(struct mbuf *m, u_int8_t *sho
if (m0 == NULL)
continue;
m0->m_pkthdr.rcvif = &vh->sc_if;
+#if NBPFILTER > 0
+   if (vh->sc_if.if_bpf)
+   bpf_mtap_hdr(vh->sc_if.if_bpf, (char *)&eh,
+   ETHER_HDR_LEN, m0, BPF_DIRECTION_IN);
+#endif
ether_input(&vh->sc_if, &eh, m0);
}
return (1);



Re: NFS writes lock up system with -o tcp,-w32768

2011-03-31 Thread Claudio Jeker
On Wed, Mar 30, 2011 at 09:33:11PM +0200, Claudio Jeker wrote:
> On Wed, Mar 30, 2011 at 08:34:24PM +0200, Mark Kettenis wrote:
> > > Date: Tue, 29 Mar 2011 22:42:47 +0200
> > > From: Claudio Jeker 
> > > 
> > > Here is a possible fix. The problem was that because of the way NFS uses
> > > the socket API it did not turn off the sendbuffer scaling, which reset the
> > > size of the socket back to 17376 bytes; that is a no-go when a buffer of
> > > more than 17k is generated by NFS. It is better to initialize sb_wat
> > > in soreserve(), which is called by NFS and all attach functions.
> > 
> > This no longer does the sbcheckreserve() dance though.  Is that alright?
> > 
> 
> The code that was there previously was a bit strange. Since when
> sb_hiwat == 0 is true then sb_wat is 0 as well. Additionally
> sbcheckreserve() would only cause the watermark to be set to the default
> which is tcp_sendspace/tcp_recvspace. So since sb_hiwat == 0 is never run.
> 

While digging deeper into this code I figured out that we may want
something a bit more like the following diff. It adds back the
sbcheckreserve() checks in TCP, and the socket buffers inherit more from
their listening socket when a new socket is created because of a SYN.
FreeBSD does something similar in its sonewconn() function. It also
seems to follow the accept(2) man page more closely:
 The accept() call extracts the first connection request on
 the queue of pending connections, creates a new socket with the same
 properties of s, and allocates a new file descriptor for the socket.

Not sure if the SB_ASYNC sb_flags needs to be inherited as well.
-- 
:wq Claudio

Index: kern/uipc_socket2.c
===
RCS file: /cvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.51
diff -u -p -r1.51 uipc_socket2.c
--- kern/uipc_socket2.c 24 Sep 2010 02:59:45 -  1.51
+++ kern/uipc_socket2.c 31 Mar 2011 14:27:31 -
@@ -176,8 +176,16 @@ sonewconn(struct socket *head, int conns
/*
 * Inherit watermarks but those may get clamped in low mem situations.
 */
+   if (soreserve(so, head->so_snd.sb_hiwat, head->so_rcv.sb_hiwat)) {
+   pool_put(&socket_pool, so);
+   return ((struct socket *)0);
+   }
so->so_snd.sb_wat = head->so_snd.sb_wat;
+   so->so_snd.sb_lowat = head->so_snd.sb_lowat;
+   so->so_snd.sb_timeo = head->so_snd.sb_timeo;
so->so_rcv.sb_wat = head->so_rcv.sb_wat;
+   so->so_rcv.sb_lowat = head->so_rcv.sb_lowat;
+   so->so_rcv.sb_timeo = head->so_rcv.sb_timeo;
 
soqinsque(head, so, soqueue);
if ((*so->so_proto->pr_usrreq)(so, PRU_ATTACH, NULL, NULL, NULL,
@@ -353,6 +361,8 @@ soreserve(struct socket *so, u_long sndc
goto bad;
if (sbreserve(&so->so_rcv, rcvcc))
goto bad2;
+   so->so_snd.sb_wat = sndcc;
+   so->so_rcv.sb_wat = rcvcc;
if (so->so_rcv.sb_lowat == 0)
so->so_rcv.sb_lowat = 1;
if (so->so_snd.sb_lowat == 0)
Index: netinet/tcp_usrreq.c
===
RCS file: /cvs/src/sys/netinet/tcp_usrreq.c,v
retrieving revision 1.105
diff -u -p -r1.105 tcp_usrreq.c
--- netinet/tcp_usrreq.c10 Oct 2010 22:02:50 -  1.105
+++ netinet/tcp_usrreq.c31 Mar 2011 13:42:56 -
@@ -652,16 +652,10 @@ tcp_attach(so)
struct inpcb *inp;
int error;
 
-   if (so->so_snd.sb_hiwat == 0 || so->so_rcv.sb_hiwat == 0) {
-   /* if low on memory only allow smaller then default buffers */
-   if (so->so_snd.sb_wat == 0 ||
-   sbcheckreserve(so->so_snd.sb_wat, tcp_sendspace))
-   so->so_snd.sb_wat = tcp_sendspace;
-   if (so->so_rcv.sb_wat == 0 ||
-   sbcheckreserve(so->so_rcv.sb_wat, tcp_recvspace))
-   so->so_rcv.sb_wat = tcp_recvspace;
-
-   error = soreserve(so, so->so_snd.sb_wat, so->so_rcv.sb_wat);
+   if (so->so_snd.sb_hiwat == 0 || so->so_rcv.sb_hiwat == 0 ||
+   sbcheckreserve(so->so_snd.sb_wat, tcp_sendspace) ||
+   sbcheckreserve(so->so_rcv.sb_wat, tcp_recvspace)) {
+   error = soreserve(so, tcp_sendspace, tcp_recvspace);
if (error)
return (error);
}



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Marco Peereboom
On Thu, Mar 31, 2011 at 03:15:59PM +0100, Stuart Henderson wrote:
> On 2011/03/31 08:29, Marco Peereboom wrote:
> > On Thu, Mar 31, 2011 at 09:13:41AM +, Stuart Henderson wrote:
> > > On 2011-03-31, Otto Moerbeek  wrote:
> > > > On Wed, Mar 30, 2011 at 03:45:02PM -0500, Amit Kulkarni wrote:
> > > >> In fsck_ffs's pass1.c it just takes forever for large sized partitions
> > > >> and also if you have very high number of files stored on that
> > > >> partition (used inodes count goes high).
> > > 
> > > If you really have a lot of used inodes, skipping the unused ones
> > > isn't going to help :-)
> > > 
> > > You could always build your large-sized filesystems with a larger
> > > value of bytes-per-inode. newfs -i 32768 or 65536 is good for common
> > > filesystem use patterns with larger partitions (for specialist uses
> > > e.g. storing backups as huge single files it might be appropriate
> > > to go even higher).
> > 
> > So this helps a lot to reduce fsck however if you play a lot with the
> > "tuning" parameters the only thing you tune is less speed.  I played
> > quite a bit with the parameters and the results were always worse than
> > the defaults.
> 
> Typical fsck times on my large partitions holding e.g. music or video
> go down from hours to minutes. This is enough of a win that I really
> don't care whether it changes anything at runtime.

I think I didn't make my point correctly.  The parameters you suggested
work great.  Pretty much all the other ones do not.



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Stuart Henderson
On 2011/03/31 08:29, Marco Peereboom wrote:
> On Thu, Mar 31, 2011 at 09:13:41AM +, Stuart Henderson wrote:
> > On 2011-03-31, Otto Moerbeek  wrote:
> > > On Wed, Mar 30, 2011 at 03:45:02PM -0500, Amit Kulkarni wrote:
> > >> In fsck_ffs's pass1.c it just takes forever for large sized partitions
> > >> and also if you have very high number of files stored on that
> > >> partition (used inodes count goes high).
> > 
> > If you really have a lot of used inodes, skipping the unused ones
> > isn't going to help :-)
> > 
> > You could always build your large-sized filesystems with a larger
> > value of bytes-per-inode. newfs -i 32768 or 65536 is good for common
> > filesystem use patterns with larger partitions (for specialist uses
> > e.g. storing backups as huge single files it might be appropriate
> > to go even higher).
> 
> So this helps a lot to reduce fsck however if you play a lot with the
> "tuning" parameters the only thing you tune is less speed.  I played
> quite a bit with the parameters and the results were always worse than
> the defaults.

Typical fsck times on my large partitions holding e.g. music or video
go down from hours to minutes. This is enough of a win that I really
don't care whether it changes anything at runtime.



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Marco Peereboom
On Thu, Mar 31, 2011 at 09:13:41AM +, Stuart Henderson wrote:
> On 2011-03-31, Otto Moerbeek  wrote:
> > On Wed, Mar 30, 2011 at 03:45:02PM -0500, Amit Kulkarni wrote:
> >> In fsck_ffs's pass1.c it just takes forever for large sized partitions
> >> and also if you have very high number of files stored on that
> >> partition (used inodes count goes high).
> 
> If you really have a lot of used inodes, skipping the unused ones
> isn't going to help :-)
> 
> You could always build your large-sized filesystems with a larger
> value of bytes-per-inode. newfs -i 32768 or 65536 is good for common
> filesystem use patterns with larger partitions (for specialist uses
> e.g. storing backups as huge single files it might be appropriate
> to go even higher).

So this helps a lot to reduce fsck times; however, if you play a lot with
the "tuning" parameters the only thing you tune is less speed.  I played
quite a bit with the parameters and the results were always worse than
the defaults.

> 
> Of course this does involve dump/restore if you need to do this for
> an existing filesystem.
> 
> > It is interesting because it really speeds up fsck_ffs for filesystems
> > with few used inodes.
> >
> > There's also a dangerous part: it assumes the cylinder group summary
> > info is ok when softdeps has been used. 
> >
> > I suppose that's the reason why it was never included into OpenBSD.
> >
> > I'll ponder if I want to work on this.
> 
> A safer alternative to this optimization might be for the installer
> (or newfs) to consider the fs size when deciding on a default inode
> density.



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Otto Moerbeek
On Thu, Mar 31, 2011 at 12:30:29PM +0200, Benny Lofgren wrote:

> For example, this is what one of my file systems looks like right now:
> 
> skynet:~# df -ih /u0
> Filesystem     Size    Used   Avail Capacity   iused     ifree %iused  Mounted on
> /dev/raid1a   12.6T    7.0T    5.5T    56%    881220 211866810    0%   /u0
> 
> This one takes about an hour to fsck.

The change discussed won't help you much here, since ffs2 filesystems
already initialize only the inode blocks that are actually used.

Memory use will be reduced, however, which might be even more
worthwhile. 
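
To put a rough number on that (a back-of-the-envelope sketch: it
assumes the old code keeps the one-byte stmap entry and the two-byte
link count per inode, as the fsck.h hunk in the posted diff suggests,
for every inode on the filesystem):

#include <stdio.h>

int
main(void)
{
	long long iused = 881220, ifree = 211866810;	/* /u0 from the df above */
	long long inodes = iused + ifree;
	long long oldtab = inodes * (1 + 2);		/* state byte + link-count short */

	printf("%lld inodes -> roughly %lld MB of fsck tables\n",
	    inodes, oldtab / (1024 * 1024));
	return 0;
}

That is on the order of 600MB of bookkeeping for the 12.6T filesystem
quoted above, before any inode data is even read.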

-Otto



acpivideo: do not trust _DOD for brightness

2011-03-31 Thread Martynas Venckus
Please test, details below.

We attach acpivout(4) to every device enumerated in _DOD.  However,
if you read the ACPI spec closely, it says that _DOD (unlike _DOS) is
not required if the system supports LCD brightness control.

In my case the situation is even worse--_DOD enumerates only 0x400
(which is non-existent), but there is this DD03 device which is an
LCD and can perfectly handle brightness control.

I suggest trusting _DOD less, and searching for devices having _BCL,
_BCM, and _BQC instead, because only such a device would be able to
handle brightness control.

We use the same trick in other drivers (look for functions instead
of trusting enumeration crap in acpi).

The following diff fixes my problem (Toshiba L300):

 acpivideo0 at acpi0: OVGA
+acpivout0 at acpivideo0: DD03

 acpivar.h   |8 -
 acpivideo.c |   81

 acpivout.c  |   30 +-
 3 files changed, 20 insertions(+), 99 deletions(-)

Index: acpivar.h
===
RCS file: /cvs/src/sys/dev/acpi/acpivar.h,v
retrieving revision 1.69
diff -u -r1.69 acpivar.h
--- acpivar.h   2 Jan 2011 04:56:57 -   1.69
+++ acpivar.h   31 Mar 2011 07:57:59 -
@@ -49,9 +49,6 @@

struct acpi_softc *sc_acpi;
struct aml_node *sc_devnode;
-
-   int *sc_dod;
-   size_t  sc_dod_len;
 };

 struct acpi_attach_args {
@@ -61,11 +58,6 @@
void*aaa_table;
struct aml_node *aaa_node;
const char  *aaa_dev;
-};
-
-struct acpivideo_attach_args {
-   struct acpi_attach_args aaa;
-   int dod;
 };

 struct acpi_mem_map {
Index: acpivideo.c
===
RCS file: /cvs/src/sys/dev/acpi/acpivideo.c,v
retrieving revision 1.7
diff -u -r1.7 acpivideo.c
--- acpivideo.c 27 Jul 2010 06:12:50 -  1.7
+++ acpivideo.c 31 Mar 2011 07:57:59 -
@@ -54,7 +54,6 @@
 intacpivideo_notify(struct aml_node *, int, void *);

 void   acpivideo_set_policy(struct acpivideo_softc *, int);
-void   acpivideo_get_dod(struct acpivideo_softc *);
 intacpi_foundvout(struct aml_node *, void *);
 intacpivideo_print(void *, const char *);

@@ -101,8 +100,6 @@
acpivideo_set_policy(sc,
DOS_SWITCH_BY_OSPM | DOS_BRIGHTNESS_BY_OSPM);

-   acpivideo_get_dod(sc);
-   aml_find_node(aaa->aaa_node, "_DCS", acpi_foundvout, sc);
aml_find_node(aaa->aaa_node, "_BCL", acpi_foundvout, sc);
 }

@@ -137,7 +134,7 @@
args.type = AML_OBJTYPE_INTEGER;

aml_evalname(sc->sc_acpi, sc->sc_devnode, "_DOS", 1, &args, &res);
-   DPRINTF(("%s: set policy to %d", DEVNAME(sc), aml_val2int(&res)));
+   DPRINTF(("%s: set policy to %X\n", DEVNAME(sc), aml_val2int(&res)));

aml_freevalue(&res);
 }
@@ -145,45 +142,23 @@
 int
 acpi_foundvout(struct aml_node *node, void *arg)
 {
-   struct aml_valueres;
-   int i, addr;
-   charfattach = 0;
-
struct acpivideo_softc *sc = (struct acpivideo_softc *)arg;
struct device *self = (struct device *)arg;
-   struct acpivideo_attach_args av;
+   struct acpi_attach_args aaa;
+   node = node->parent;

-   if (sc->sc_dod == NULL)
-   return (0);
-   DPRINTF(("Inside acpi_foundvout()"));
-   if (aml_evalname(sc->sc_acpi, node->parent, "_ADR", 0, NULL, &res)) {
-   DPRINTF(("%s: no _ADR\n", DEVNAME(sc)));
+   DPRINTF(("Inside acpi_foundvout()\n"));
+   if (node->parent != sc->sc_devnode)
return (0);
-   }
-   addr = aml_val2int(&res);
-   DPRINTF(("_ADR: %X\n", addr));
-   aml_freevalue(&res);

-   for (i = 0; i < sc->sc_dod_len; i++)
-   if (addr == (sc->sc_dod[i]&0x)) {
-   DPRINTF(("Matched: %X\n", sc->sc_dod[i]));
-   fattach = 1;
-   break;
-   }
-   if (fattach) {
-   memset(&av, 0, sizeof(av));
-   av.aaa.aaa_iot = sc->sc_acpi->sc_iot;
-   av.aaa.aaa_memt = sc->sc_acpi->sc_memt;
-   av.aaa.aaa_node = node->parent;
-   av.aaa.aaa_name = "acpivout";
-   av.dod = sc->sc_dod[i];
-   /*
-*  Make sure we don't attach twice if both _BCL and
-* _DCS methods are found by zeroing the DOD address.
-*/
-   sc->sc_dod[i] = 0;
+   if (aml_searchname(node, "_BCM") && aml_searchname(node, "_BQC")) {
+   memset(&aaa, 0, sizeof(aaa));
+   aaa.aaa_iot = sc->sc_acpi->sc_iot;
+   aaa.aaa_memt = sc->sc_acpi->sc_memt;
+   aaa.aaa_node = node;
+   aaa.aaa_name = "acpivout";

-   config_found(self, &av, acpivideo_print);
+   config_found(self, &aaa, acpivideo_print);
}

return (0);
@@ -202,38 +177,6 @@
}

   

Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Stuart Henderson
On 2011/03/31 12:46, Otto Moerbeek wrote:
> > 
> > In general, the default values and algorithms for allocations could
> > probably do with a tune-up, since of course today's disks are several
> > magnitudes larger than only a few years ago (let alone than those that
> > were around when the bulk of the file system code was written!), and the
> > usage patterns are also in my experience often wildly different in a
> > large file system than in a smaller one.
> 
> We do that already; inode density will be lower for newly created
> partitions, because disklabel sets larger block and fragment sizes.

Ah, the manual is out-of-date.

Index: newfs.8
===
RCS file: /cvs/src/sbin/newfs/newfs.8,v
retrieving revision 1.68
diff -u -p -r1.68 newfs.8
--- newfs.8 21 Mar 2010 07:51:23 -  1.68
+++ newfs.8 31 Mar 2011 11:10:18 -
@@ -169,7 +169,7 @@ The expected average file size for the f
 The expected average number of files per directory on the file system.
 .It Fl i Ar bytes
 This specifies the density of inodes in the file system.
-The default is to create an inode for each 8192 bytes of data space.
+The default is to create an inode for every 4 fragments.
 If fewer inodes are desired, a larger number should be used;
 to create more inodes a smaller number should be given.
 .It Fl m Ar free-space



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Otto Moerbeek
On Thu, Mar 31, 2011 at 12:30:29PM +0200, Benny Lofgren wrote:

> On 2011-03-31 11.13, Stuart Henderson wrote:
> > On 2011-03-31, Otto Moerbeek  wrote:
> >> On Wed, Mar 30, 2011 at 03:45:02PM -0500, Amit Kulkarni wrote:
> >>> In fsck_ffs's pass1.c it just takes forever for large sized partitions
> >>> and also if you have very high number of files stored on that
> >>> partition (used inodes count goes high).
> > If you really have a lot of used inodes, skipping the unused ones
> > isn't going to help :-)
> > You could always build your large-sized filesystems with a larger
> > value of bytes-per-inode. newfs -i 32768 or 65536 is good for common
> > filesystem use patterns with larger partitions (for specialist uses
> > e.g. storing backups as huge single files it might be appropriate
> > to go even higher).
> > Of course this does involve dump/restore if you need to do this for
> > an existing filesystem.
> >> It is interesting because it really speeds up fsck_ffs for filesystems
> >> with few used inodes.
> >> There's also a dangerous part: it assumes the cylinder group summary
> >> info is ok when softdeps has been used. 
> >> I suppose that's the reason why it was never included into OpenBSD.
> >> I'll ponder if I want to work on this.
> > 
> > A safer alternative to this optimization might be for the installer
> > (or newfs) to consider the fs size when deciding on a default inode
> > density.
> 
> I think this is a very good idea regardless. I often forget to manually
> tune large file systems, and end up with some ridiculously skewed
> resource allocations.
> 
> For example, this is what one of my file systems looks like right now:
> 
> skynet:~# df -ih /u0
> Filesystem     Size    Used   Avail Capacity   iused     ifree %iused  Mounted on
> /dev/raid1a   12.6T    7.0T    5.5T    56%    881220 211866810    0%   /u0
> 
> This one takes about an hour to fsck.
> 
> In general, the default values and algorithms for allocations could
> probably do with a tune-up, since of course today's disks are several
> magnitudes larger than only a few years ago (let alone than those that
> were around when the bulk of the file system code was written!), and the
> usage patterns are also in my experience often wildly different in a
> large file system than in a smaller one.

We do that already; inode density will be lower for newly created
partitions, because disklabel sets larger block and fragment sizes.

-Otto

> 
> I guess an fs like the one above would benefit a lot from the optimization
> the OP mentions.
> 
> Perhaps it could be optional, since Otto mentions that it makes
> assumptions on correctness of the cylinder group summary info. I haven't
> looked at the code in a while, so I can't really judge the consequences
> of that, or if some middle ground can be reached where the CG info is
> sanity checked without the need for a full scan through every inode.
> 
> 
> Regards,
> /Benny
> 
> -- 
> internetlabbet.se / work:   +46 8 551 124 80  / "Words must
> Benny Löfgren    /  mobile: +46 70 718 11 90 /   be weighed,
> /   fax:+46 8 551 124 89/not counted."
>/email:  benny -at- internetlabbet.se



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Otto Moerbeek
On Thu, Mar 31, 2011 at 09:13:41AM +, Stuart Henderson wrote:

> On 2011-03-31, Otto Moerbeek  wrote:
> > On Wed, Mar 30, 2011 at 03:45:02PM -0500, Amit Kulkarni wrote:
> >> In fsck_ffs's pass1.c it just takes forever for large sized partitions
> >> and also if you have very high number of files stored on that
> >> partition (used inodes count goes high).
> 
> If you really have a lot of used inodes, skipping the unused ones
> isn't going to help :-)
> 
> You could always build your large-sized filesystems with a larger
> value of bytes-per-inode. newfs -i 32768 or 65536 is good for common
> filesystem use patterns with larger partitions (for specialist uses
> e.g. storing backups as huge single files it might be appropriate
> to go even higher).

disklabel has code already to move to larger block and frag sizes for
large (new) partitions. newfs picks these settings up.


> 
> Of course this does involve dump/restore if you need to do this for
> an existing filesystem.
> 
> > It is interesting because it really speeds up fsck_ffs for filesystems
> > with few used inodes.
> >
> > There's also a dangerous part: it assumes the cylinder group summary
> > info is ok when softdeps has been used. 
> >
> > I suppose that's the reason why it was never included into OpenBSD.
> >
> > I'll ponder if I want to work on this.
> 
> A safer alternative to this optimization might be for the installer
> (or newfs) to consider the fs size when deciding on a default inode
> density.

-Otto



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Benny Lofgren
On 2011-03-31 11.13, Stuart Henderson wrote:
> On 2011-03-31, Otto Moerbeek  wrote:
>> On Wed, Mar 30, 2011 at 03:45:02PM -0500, Amit Kulkarni wrote:
>>> In fsck_ffs's pass1.c it just takes forever for large sized partitions
>>> and also if you have very high number of files stored on that
>>> partition (used inodes count goes high).
> If you really have a lot of used inodes, skipping the unused ones
> isn't going to help :-)
> You could always build your large-sized filesystems with a larger
> value of bytes-per-inode. newfs -i 32768 or 65536 is good for common
> filesystem use patterns with larger partitions (for specialist uses
> e.g. storing backups as huge single files it might be appropriate
> to go even higher).
> Of course this does involve dump/restore if you need to do this for
> an existing filesystem.
>> It is interesting because it really speeds up fsck_ffs for filesystems
>> with few used inodes.
>> There's also a dangerous part: it assumes the cylinder group summary
>> info is ok when softdeps has been used. 
>> I suppose that's the reason why it was never included into OpenBSD.
>> I'll ponder if I want to work on this.
> 
> A safer alternative to this optimization might be for the installer
> (or newfs) to consider the fs size when deciding on a default inode
> density.

I think this is a very good idea regardless. I often forget to manually
tune large file systems, and end up with some ridiculously skewed
resource allocations.

For example, this is what one of my file systems looks like right now:

skynet:~# df -ih /u0
Filesystem     Size    Used   Avail Capacity   iused     ifree %iused  Mounted on
/dev/raid1a   12.6T    7.0T    5.5T    56%    881220 211866810    0%   /u0

This one takes about an hour to fsck.

In general, the default values and algorithms for allocations could
probably do with a tune-up, since of course today's disks are several
magnitudes larger than only a few years ago (let alone than those that
were around when the bulk of the file system code was written!), and the
usage patterns are also in my experience often wildly different in a
large file system than in a smaller one.

I guess an fs like the one above would benefit a lot from the optimization
the OP mentions.

Perhaps it could be optional, since Otto mentions that it makes
assumptions on correctness of the cylinder group summary info. I haven't
looked at the code in a while, so I can't really judge the consequences
of that, or if some middle ground can be reached where the CG info is
sanity checked without the need for a full scan through every inode.


Regards,
/Benny

-- 
internetlabbet.se / work:   +46 8 551 124 80  / "Words must
Benny Löfgren    /  mobile: +46 70 718 11 90 /   be weighed,
/   fax:+46 8 551 124 89/not counted."
   /email:  benny -at- internetlabbet.se



games/tetris: hide the cursor during game

2011-03-31 Thread David Coppa
Hi, 

Nice improvement for the best Tetris implementation ever made.
From NetBSD.

OK? 

Index: screen.c
===
RCS file: /cvs/src/games/tetris/screen.c,v
retrieving revision 1.13
diff -u -p -r1.13 screen.c
--- screen.c20 Apr 2006 03:25:36 -  1.13
+++ screen.c31 Mar 2011 08:16:26 -
@@ -80,7 +80,9 @@ static char
*LLstr, /* last line, first column */
*pcstr, /* pad character */
*TEstr, /* end cursor motion mode */
-   *TIstr; /* begin cursor motion mode */
+   *TIstr, /* begin cursor motion mode */
+   *VIstr, /* make cursor invisible */
+   *VEstr; /* make cursor appear normal */
 char
*SEstr, /* end standout mode */
*SOstr; /* begin standout mode */
@@ -107,6 +109,8 @@ struct tcsinfo {/* termcap string info
{"so", &SOstr},
{"te", &TEstr},
{"ti", &TIstr},
+   {"vi", &VIstr},
+   {"ve", &VEstr},
{"up", &UP},/* cursor up */
{ {0}, NULL}
 };
@@ -291,6 +295,8 @@ scr_set(void)
 */
if (TIstr)
putstr(TIstr);  /* termcap(5) says this is not padded */
+   if (VIstr)
+   putstr(VIstr);  /* termcap(5) says this is not padded */
if (tstp != SIG_IGN)
(void) signal(SIGTSTP, scr_stop);
if (ttou != SIG_IGN)
@@ -321,6 +327,8 @@ scr_end(void)
/* exit screen mode */
if (TEstr)
putstr(TEstr);  /* termcap(5) says this is not padded */
+   if (VEstr)
+   putstr(VEstr);  /* termcap(5) says this is not padded */
(void) fflush(stdout);
(void) tcsetattr(0, TCSADRAIN, &oldtt);
isset = 0;



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Stuart Henderson
On 2011-03-31, Otto Moerbeek  wrote:
> On Wed, Mar 30, 2011 at 03:45:02PM -0500, Amit Kulkarni wrote:
>> In fsck_ffs's pass1.c it just takes forever for large sized partitions
>> and also if you have very high number of files stored on that
>> partition (used inodes count goes high).

If you really have a lot of used inodes, skipping the unused ones
isn't going to help :-)

You could always build your large-sized filesystems with a larger
value of bytes-per-inode. newfs -i 32768 or 65536 is good for common
filesystem use patterns with larger partitions (for specialist uses
e.g. storing backups as huge single files it might be appropriate
to go even higher).

Of course this does involve dump/restore if you need to do this for
an existing filesystem.

> It is interesting because it really speeds up fsck_ffs for filesystems
> with few used inodes.
>
> There's also a dangerous part: it assumes the cylinder group summary
> info is ok when softdeps has been used. 
>
> I suppose that's the reason why it was never included into OpenBSD.
>
> I'll ponder if I want to work on this.

A safer alternative to this optimization might be for the installer
(or newfs) to consider the fs size when deciding on a default inode
density.



Re: horribly slow fsck_ffs pass1 performance

2011-03-31 Thread Otto Moerbeek
On Wed, Mar 30, 2011 at 03:45:02PM -0500, Amit Kulkarni wrote:

> Hi,
> 
> In fsck_ffs's pass1.c it just takes forever for large sized partitions
> and also if you have very high number of files stored on that
> partition (used inodes count goes high).
> 
> fsck main limitation is in pass1.c.
> 
> In pass1.c I found out that it in fact proceeded to check all inodes,
> but there's a misleading comment there, which says, "Find all
> allocated blocks". So the original intent was to check only used
> inodes in that code block, but somebody deleted that part of the code
> compared to FreeBSD. Is there any special reason not to build a used
> inode list and then only go through it, as FreeBSD does? I know they
> added some stuff in the last year but that part of the code has
> existed for a long time and we don't have it. Why not?
> 
> I was reading cvs ver 1.46 of pass1.c in FreeBSD.
> 
> Thanks

AFAIK, we never had that optimization.

It is interesting because it really speeds up fsck_ffs for filesystems
with few used inodes.

There's also a dangerous part: it assumes the cylinder group summary
info is ok when softdeps has been used. 

I suppose that's the reason why it was never included into OpenBSD.

I'll ponder if I want to work on this.

-Otto