Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-07-07 Thread Andre Albsmeier
On Thu, 04-Jul-2013 at 08:15:50 +0200, Konstantin Belousov wrote:
> On Thu, Jul 04, 2013 at 07:27:00AM +0200, Andre Albsmeier wrote:
> > On Thu, 04-Jul-2013 at 07:24:40 +0200, Konstantin Belousov wrote:
> > > On Thu, Jul 04, 2013 at 07:14:09AM +0200, Andre Albsmeier wrote:
> > > > On Mon, 17-Jun-2013 at 21:30:31 +0200, John Baldwin wrote:
> > > > > On Sunday, June 16, 2013 2:39:42 am Andre Albsmeier wrote:
> > > > > > On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote:
> > > > > > > On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote:
> > > > > > > > Each day at 5:15 we are generating snapshots on various 
> > > > > > > > machines.
> > > > > > > > This used to work perfectly under 7-STABLE for years but since
> > > > > > > > we started to use 9.1-STABLE the machine reboots in about 10%
> > > > > > > > of all cases.
> > > > > > > > 
> > > > > > > > After rebooting we find a new snapshot file which is a bit
> > > > > > > > smaller than the good ones and has different permissions.
> > > > > > > > It does not pass fsck. In this example it is the one
> > > > > > > > whose name begins with s3:
> > > > > > > > 
> > > > > > > > -r--r-   1 root  operator  snapshot 72802894528 29 May 
> > > > > > > > 05:15 s2-2013.05.28-03.15.04
> > > > > > > > -r   1 root  operator  snapshot 72802893824 29 May 
> > > > > > > > 05:15 s3-2013.05.29-03.15.03
> > > > > > > > -r--r-   1 root  operator  snapshot 72802894528 28 May 
> > > > > > > > 14:22 s4-2013.05.23-06.38.44
> > > > > > > > -r--r-   1 root  operator  snapshot 72802894528 28 May 
> > > > > > > > 14:22 s5-2013.05.24-03.15.03
> > > > > > > > -r--r-   1 root  operator  snapshot 72802894528 28 May 
> > > > > > > > 14:22 s6-2013.05.25-03.15.03
> > > > > > > > 
> > > > > > > > After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel
> > > > > > > > I see the following LORs (mksnap_ffs starts exactly at 5:15):
> > > > > > > > 
> > > > > > > > May 29 05:15:00  palveli kernel: lock order reversal:
> > > > > > > > May 29 05:15:00  palveli kernel: 1st 0xc2371da8 ufs 
> > > > > > > > (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
> > > > > > > > May 29 05:15:00  palveli kernel: 2nd 0xc2371ec4 
> > > > > > > > devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
> > > > > > > > May 29 05:15:04  palveli kernel: lock order reversal:
> > > > > > > > May 29 05:15:04  palveli kernel: 1st 0xc228471c 
> > > > > > > > snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976
> > > > > > > > May 29 05:15:04  palveli kernel: 2nd 0xc22f25e4 ufs 
> > > > > > > > (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626
> > > > > > > > 
> > > > > > > > Unfortunately no corefiles are being generated ;-(.
> > > > > > > > 
> > > > > > > > I have checked and even rebuilt the (UFS1) fs in question
> > > > > > > > from scratch. I have also seen this happen on an UFS2 on
> > > > > > > > another machine and on a third one when running "dump -L"
> > > > > > > > on a root fs.
> > > > > > > > 
> > > > > > > > Any hints of how to proceed?
> > > > > > > 
> > > > > > > Would it be possible to set up a serial console that is logged on
> > > > > > > this machine to see if it is panic'ing but failing to write out
> > > > > > > a crashdump?
> > > > > > 
> > > > > > Couldn't attach the serial console yet ;-(. But I had people
> > > > > > attach a KVMoverIP switch and enabled the various KDB options
> > > > > > in the kernel. Now we can see a bit more (see below) -- no
> > > > > > crashdump is being generated though.
> > > > > 
> > > > > :(  Unfortunately these LORs don't really help with discerning the
> > > > > cause of the reboot.  If you have remote power access (and still
> > > > > wanted to test this) one option would be to change KDB to drop into
> > > > > the debugger on a panic.  Then you could connect over the KVM and
> > > > > take images of the original panic along with a stack trace.
> > > > 
> > > > After a few days of no problems, the box decided to crash
> > > > during mksnap_ffs today ;-(. But now I have a crashdump,
> > > > see below. Unfortunately, I cannot upload the dump anywhere,
> > > > but if you ask me to check anything, I'll be happy to help.
> > > > 
> > > > kgdb /usr/obj/src/src-9/sys/palveli/kernel.debug vmcore.4
> > > > GNU gdb 6.1.1 [FreeBSD]
> > > > Copyright 2004 Free Software Foundation, Inc.
> > > > GDB is free software, covered by the GNU General Public License, and 
> > > > you are
> > > > welcome to change it and/or distribute copies of it under certain 
> > > > conditions.
> > > > Type "show copying" to see the conditions.
> > > > There is absolutely no warranty for GDB.  Type "show warranty" for 
> > > > details.
> > > > This GDB was configured as "i386-marcel-freebsd"...
> > > > 
> > > > Unread portion of the kernel message buffer:
> > > > 
> > > > 
> > > > Fatal trap 12: page fault while in kernel mode
> > > > fault virtual address   = 0xcfb5e000
> > > > fault code  = su

RE: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Hans Petter Selasky
Hi,

FYI: The USB stack will currently run a complete controller reset upon resume, 
like during boot.

--HPS 

 
-Original message-
> From: Ian Smith <smi...@nimnet.asn.au>
> Sent: Sunday 7th July 2013 7:52
> To: Adrian Chadd <adr...@freebsd.org>
> Cc: freebsd-a...@freebsd.org; freebsd-stable@freebsd.org;
> freebsd-...@freebsd.org
> Subject: Re: USB ports on Lenovo T400 do not work after a suspend/resume
> 
> On Sun, 30 Jun 2013 15:02:57 -0700, Adrian Chadd wrote:
>  > On 30 June 2013 07:22, Ian Smith wrote:
> [..]
>  > > Nothing of note that I can see, if that usb hub-to-bus remapping is
>  > > normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
>  > > Maybe someone who knows might comment on that?
> 
> Does no one know what that signifies?  Maybe it's not relevant to this.
> 
>  > > Just checking: you've tried other USB devices apart from uftdi0?
>  > 
>  > Yup, there's no 5v on the port.
> 
> I was rather taken aback to hear this.  Would not this indicate a 
> failure to reinitialise the basic underlying USB hardware on resume?
> 
> More than a bit bemused, Ian
> ___
> freebsd-a...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-acpi
> To unsubscribe, send any mail to "freebsd-acpi-unsubscr...@freebsd.org"
> 

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


RE: XHCI umass support breaks between r248085 and r252560 on 9-STABLE

2013-07-07 Thread Hans Petter Selasky
Hi,

Check for CAM/SCSI related changes. There have not been many USB changes
recently, so this is possibly not USB related.

Thank you,

--HPS
 
-Original message-
> From: Alexandre Kovalenko <bsd.gai...@gmail.com>
> Sent: Thursday 4th July 2013 20:58
> To: freebsd-...@freebsd.org
> Cc: freebsd-stable@freebsd.org
> Subject: XHCI umass support breaks between r248085 and r252560 on 9-STABLE
> 
> Three different external hard drives (Seagate, Western Digital and a noname
> USB 3.0 enclosure) refused to be recognized as umass devices. Reverting
> /usr/src/sys/dev/usb/controller to r248085, then building and loading just
> the xhci module, makes the drives appear again. Below are snippets from the
> log in both cases:
> 
> Non working:
> 
> Jul  4 14:35:17 twinhead kernel: xhci0:  
> mem 0xfddfe000-0xfddf irq 16 at device 0.0 on pci2
> Jul  4 14:35:17 twinhead kernel: xhci0: 64 byte context size.
> Jul  4 14:35:17 twinhead kernel: usbus0 on xhci0
> Jul  4 14:35:17 twinhead kernel: usbus0: 5.0Gbps Super Speed USB v3.0
> Jul  4 14:35:17 twinhead kernel: ugen0.1: <0x1912> at usbus0
> Jul  4 14:35:17 twinhead kernel: uhub0: <0x1912 XHCI root HUB, class 9/0, rev 
> 3.00/1.00, addr 1> on usbus0
> Jul  4 14:35:17 twinhead kernel: uhub0: 8 ports with 8 removable, self powered
> Jul  4 14:35:24 twinhead kernel: ugen0.2:  at usbus0
> Jul  4 14:35:24 twinhead kernel: umass0:  3.00/0.01, addr 1> on usbus0
> Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 
> 00 00 00 24 00 
> Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
> request completed with an error
> Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
> Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 
> 00 00 00 24 00 
> Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
> request completed with an error
> Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
> Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 
> 00 00 00 24 00 
> Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
> request completed with an error
> Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
> Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 
> 00 00 00 24 00 
> Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
> request completed with an error
> Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
> Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 12 
> 00 00 00 24 00 
> Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
> request completed with an error
> Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): Error 5, Retries 
> exhausted
> 
> Working:
> 
> Jul  4 14:40:20 twinhead kernel: ugen0.2:  at usbus0 (disconnected)
> Jul  4 14:40:20 twinhead kernel: umass0: at uhub0, port 2, addr 1 
> (disconnected)
> Jul  4 14:40:27 twinhead kernel: ugen0.2:  at usbus0
> Jul  4 14:40:27 twinhead kernel: umass0:  0/0, rev 3.00/0.01, addr 1> on usbus0
> Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): REPORT LUNS. CDB: 
> a0 00 00 00 00 00 00 00 00 10 00 00 
> Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: SCSI 
> Status Error
> Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): SCSI status: 
> Check Condition
> Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): SCSI sense: 
> ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
> Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): Error 22, 
> Unretryable error
> Jul  4 14:40:27 twinhead kernel: da0 at umass-sim0 bus 0 scbus4 target 0 lun 0
> Jul  4 14:40:27 twinhead kernel: da0:  Fixed 
> Direct Access SCSI-5 device 
> Jul  4 14:40:27 twinhead kernel: da0: 400.000MB/s transfers
> Jul  4 14:40:27 twinhead kernel: da0: 190782MB (390721968 512 byte sectors: 
> 255H 63S/T 24321C)
> Jul  4 14:40:27 twinhead kernel: da0: quirks=0x2
> 
> I can provide additional information or try  patches as necessary.
> 
> Alexandre "Sunny" Kovalenko (Олександр Коваленко)
> 
> ___
> freebsd-...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-usb
> To unsubscribe, send any mail to "freebsd-usb-unsubscr...@freebsd.org"
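
As a quick sanity check of the da0 lines in the working log above, the reported size and geometry follow directly from the sector count. A minimal Python sketch; the variable names are illustrative, and only the numbers come from the log:

```python
# Numbers copied from the "da0:" lines in the working log.
SECTORS = 390721968
SECTOR_SIZE = 512            # bytes per sector
HEADS = 255
SECTORS_PER_TRACK = 63

size_bytes = SECTORS * SECTOR_SIZE
size_mb = size_bytes // (1024 * 1024)                  # whole megabytes, rounded down
cylinders = SECTORS // (HEADS * SECTORS_PER_TRACK)     # whole cylinders, rounded down

print(size_mb, cylinders)
```

Both results agree with the figures the kernel printed (190782MB and 24321C), so the drive is being probed with its full capacity once the older xhci module is loaded.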


Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-07-07 Thread Konstantin Belousov
On Sun, Jul 07, 2013 at 09:25:53AM +0200, Andre Albsmeier wrote:
> OK, here we go (looks better now):
> 
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
> 
> Unread portion of the kernel message buffer:
> dev = stripe/p, block = 592, fs = /palveli
> panic: ffs_blkfree_cg: freeing free block
> KDB: stack backtrace:
> db_trace_self_wrapper(c08207eb,d70fc924,c05fdfc9,c081df13,c08a82e0,...) at 
> db_trace_self_wrapper+0x26/frame 0xd70fc8f4
> kdb_backtrace(c081df13,c08a82e0,c0833a0b,d70fc930,d70fc930,...) at 
> kdb_backtrace+0x29/frame 0xd70fc900
> panic(c0833a0b,c2aae178,250,0,c2af80d4,...) at panic+0xc9/frame 0xd70fc924
> ffs_blkfree_cg(250,0,8000,49f,d70fcad0,...) at ffs_blkfree_cg+0x399/frame 
> 0xd70fc9c8
> ffs_blkfree(c2b35100,c2af8000,c2b0d470,250,0,...) at ffs_blkfree+0xad/frame 
> 0xd70fca00
> indir_trunc(fffa3ff4,,0,8000,0,...) at indir_trunc+0x658/frame 
> 0xd70fcae0
> indir_trunc(dff3,,c072df0a,c2d68d00,c087abd8,...) at 
> indir_trunc+0x514/frame 0xd70fcbc0
> handle_workitem_freeblocks(0,d70fcc4c,2,246,c2ab1000,...) at 
> handle_workitem_freeblocks+0x2dc/frame 0xd70fcc24
> process_worklist_item(0,0,0,c086ae78,0,...) at 
> process_worklist_item+0x27a/frame 0xd70fcc6c
> softdep_process_worklist(c2b36548,0,54,c0835825,64,...) at 
> softdep_process_worklist+0x91/frame 0xd70fcc9c
> softdep_flush(0,d70fcd08,0,c2aac2f0,0,...) at softdep_flush+0x3e4/frame 
> 0xd70f
> fork_exit(c0738bb0,0,d70fcd08) at fork_exit+0xa2/frame 0xd70fccf4
> fork_trampoline() at fork_trampoline+0x8/frame 0xd70fccf4
> --- trap 0, eip = 0, esp = 0xd70fcd40, ebp = 0 ---
> Uptime: 2d16h29m37s
> Physical memory: 503 MB
> Dumping 95 MB: 80 64 48 32 16
> 
> No symbol "stopped_cpus" in current context.
> No symbol "stoppcbs" in current context.
> #0  doadump (textdump=1) at pcpu.h:249
> 249 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) where
> #0  doadump (textdump=1) at pcpu.h:249
> #1  0xc05f in kern_reboot (howto=260) at 
> /src/src-9/sys/kern/kern_shutdown.c:449
> #2  0xc05fe028 in panic (fmt=) at 
> /src/src-9/sys/kern/kern_shutdown.c:637
> #3  0xc0717899 in ffs_blkfree_cg (ump=0xc2b35100, fs=0xc2af8000, 
> devvp=0xc2b0d470, bno=592, 
> size=32768, inum=1183, dephd=0xd70fcad0) at 
> /src/src-9/sys/ufs/ffs/ffs_alloc.c:2151
> #4  0xc0717c8d in ffs_blkfree (ump=0xc2b35100, fs=0xc2af8000, 
> devvp=0xc2b0d470, bno=592, 
> size=32768, inum=1183, vtype=VREG, dephd=0xd70fcad0) at 
> /src/src-9/sys/ufs/ffs/ffs_alloc.c:2280
> #5  0xc0730348 in indir_trunc (freework=0xc2f99100, dbn=1642816, lbn=-376844)
> at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7965
> #6  0xc0730204 in indir_trunc (freework=0xc2f99100, dbn=1639680, lbn=-8205)
> at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7946
> #7  0xc07324bc in handle_workitem_freeblocks (freeblks=0xc2fc1e00, flags=512)
> at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7588
> #8  0xc0730dfa in process_worklist_item (mp=0xc2b36548, target=10, flags=512)
> at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1774
> #9  0xc07360c1 in softdep_process_worklist (mp=0xc2b36548, full=0)
> at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1558
> #10 0xc0738f94 in softdep_flush () at 
> /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> #11 0xc05d1b82 in fork_exit (callout=0xc0738bb0 , arg=0x0, 
> frame=0xd70fcd08)
> at /src/src-9/sys/kern/kern_fork.c:988
> #12 0xc07ba904 in fork_trampoline () at 
> /src/src-9/sys/i386/i386/exception.s:279
> (kgdb) up 10
> #10 0xc0738f94 in softdep_flush () at 
> /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> 1414            progress += softdep_process_worklist(mp, 0);
> 
>   -Andre

This looks unrelated; this exact panic usually has one of two causes:
- corrupted filesystem, run fsck to recheck it;
- faulty hardware, most likely RAM, but might be CPU/CPU cache/bus.

Is it the same machine where the bcopy panic occurred?
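
A note for anyone cross-reading the two traces quoted above: the KDB backtrace prints arguments in hexadecimal without a 0x prefix, while kgdb prints them in decimal, so both listings describe the same ffs_blkfree_cg() call. A small Python sketch of the conversion; the dictionary layout is only illustrative, and the mapping of the KDB words to bno, size and inum is an assumption based on the matching values:

```python
# Arguments of ffs_blkfree_cg() as printed by the KDB backtrace (hex, no 0x).
kdb_args = {"bno": "250", "size": "8000", "inum": "49f"}

# The same arguments as printed by kgdb for frame #3 (decimal).
kgdb_args = {"bno": 592, "size": 32768, "inum": 1183}

# Interpret each KDB word as a base-16 number.
decoded = {name: int(text, 16) for name, text in kdb_args.items()}

for name, value in decoded.items():
    print(f"{name}: 0x{kdb_args[name]} = {value} (kgdb: {kgdb_args[name]})")
```

The decoded block number also matches the "block = 592" line in the kernel message buffer.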




Re: status of autotuning freebsd for 9.2

2013-07-07 Thread Andre Oppermann

On 07.07.2013 08:32, Alfred Perlstein wrote:
> Andre,
>
> Are you going to have time to MFC things from -current for auto-tuning
> -stable before 9.2?

I simply ran out of time on Friday, and MFCing such a big change requires
more testing.

> I fear (maybe unnecessarily?) that we are about to ship yet another
> release that can't do basic 10gigE when sufficient memory exists.

I debated with myself whether such a behavior-changing MFC would be
appropriate for a mid-stream stable release.  I guess yes, though a
number of people who currently set the parameters manually would have
to remove their tuning settings.

> If you don't have time, then let me know and I'll see what I can do.

Can you help me with testing?

--
Andre



Re: make buildworld is now 50% slower

2013-07-07 Thread Daniel Braniss
> On Fri, Jul 05, 2013 at 02:39:00PM +0200, Dimitry Andric wrote:
> > [redirecting to the correct mailing list, freebsd-stable@ ...]
> > 
> > On Jul 5, 2013, at 10:53, Daniel Braniss  wrote:
> > > > after today's update of 9.1-STABLE I noticed that make
> > > > build[world|kernel] are taking considerably more time, is it
> > > > because of the upgrade of clang?
> > > > and if so, is the code produced any better?
> > > > 
> > > > before:
> > > > buildworld:   26m4.52s real 2h28m32.12s user 36m6.27s sys
> > > > buildkernel:  7m29.42s real   23m22.22s user 4m26.26s sys
> > > > 
> > > > today:
> > > > buildworld:  34m29.80s real  2h38m9.37s user 37m7.61s sys
> > > > buildkernel: 15m31.52s real   22m59.40s user 4m33.06s sys
> > > 
> > > Ehm, your user and sys times are not that much different at all, they
> > > add up to about 5% slower for buildworld, and 1% faster for buildkernel.
> > Are you sure nothing else is running on that machine, eating up CPU time
> > while you are building? :)
> > 
> > But yes, clang 3.3 is of course somewhat larger than 3.2.  You might
> > especially notice that, if you are using gcc, which is very slow at
> > compiling C++.
> > 
> > In any case, if you do not care about clang, just set WITHOUT_CLANG= in
> > your /etc/src.conf, and you can shave off some build time.
> 
> I just built world/kernel (stable/9 r252769) 5 hours ago.  Results:
> 
> time make -j4 buildworld  = roughly 21 minutes on my hardware
> time make -j4 buildkernel = roughly 8 minutes on my hardware
> 

It's been a long time since I saw such numbers, maybe it's time
to see where time is being spent, I will run it without clang to compare with
your numbers.

> These numbers are about the norm for me, meaning I do not see a
> substantial increase in build times.
> 
> Key point: I do not use/build/grok clang, i.e. WITHOUT_CLANG=true is in
> my src.conf.  But I am aware of the big clang change in r252723.
> 
> If hardware details are wanted, ask, but I don't think it's relevant to
> what the root cause is.
> 

from what you are saying, I guess clang is not responsible.
looking for my Sherlock Holmes hat.
thanks,
danny

> -- 
> | Jeremy Chadwick                                   j...@koitsu.org |
> | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> | Making life hard for others since 1977.             PGP 4BD6C0CB |
> 
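
The percentages discussed in the quoted text can be recomputed from the four timing lines. A minimal Python sketch; the helper names are made up for illustration, and only the durations come from the thread:

```python
def seconds(text):
    """Convert a time(1)-style duration such as '2h28m32.12s' to seconds."""
    total, number = 0.0, ""
    for ch in text:
        if ch in "hms":
            total += float(number) * {"h": 3600, "m": 60, "s": 1}[ch]
            number = ""
        else:
            number += ch
    return total

# (real, user, sys) copied from the quoted message.
runs = {
    "buildworld":  {"before": ("26m4.52s", "2h28m32.12s", "36m6.27s"),
                    "today":  ("34m29.80s", "2h38m9.37s", "37m7.61s")},
    "buildkernel": {"before": ("7m29.42s", "23m22.22s", "4m26.26s"),
                    "today":  ("15m31.52s", "22m59.40s", "4m33.06s")},
}

def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

changes = {}
for target, run in runs.items():
    b_real, b_user, b_sys = map(seconds, run["before"])
    t_real, t_user, t_sys = map(seconds, run["today"])
    changes[target] = {
        "real": percent_change(b_real, t_real),
        "cpu": percent_change(b_user + b_sys, t_user + t_sys),
    }
    print(f"{target}: real {changes[target]['real']:+.1f}%, "
          f"user+sys {changes[target]['cpu']:+.1f}%")
```

On these numbers the combined user+sys time moves by only a few percent in either direction, while the wall-clock time grows by roughly a third for buildworld and roughly doubles for buildkernel. The extra time is therefore mostly spent waiting rather than computing.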




Fwd: ixgbe Jumbo race condition leading to Deadlock

2013-07-07 Thread Kaushal Bhandankar
On the 82599, for a jumbo packet of 9.5 K (which consumes 5 descriptors of
2048 bytes each), when does the descriptor write-back happen? Does it
happen per descriptor, or once for the whole group of descriptors? Is it
possible for all descriptors except the last one to be written back, so
that a read of the RDH register returns the last descriptor while it is
still pending inside the 82599?
We are using srrctl |= IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;

In my setup, I do not see EOP set even after reading 5 descriptors, so
checking DD alone gives me an incomplete packet. What should I do in such
a case?

References from Data sheet:
-> Checking through DD bits eliminates a potential race condition: all
descriptor data is posted internally prior to incrementing the head
register and a read of the head register could potentially pass the
descriptor waiting inside the 82599.

Regards,
Kaushal
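
One common way for a receive path to cope with a frame whose descriptors are written back at different times is sketched below in Python. It is a pure simulation under stated assumptions, not the ixgbe driver's code and not something taken from the datasheet. The idea is to walk the ring only while DD is set, accumulate buffers, and hand a frame up only once a descriptor with both DD and EOP has been seen. If the walk stops on a descriptor without DD, the partial state is kept and the walk resumes on the next poll. The class, field and method names are hypothetical; the 2048-byte buffers and the 9.5 K frame come from the question.

```python
import math
from dataclasses import dataclass

BUF_SIZE = 2048                      # bytes per receive buffer (from the question)
FRAME_LEN = int(9.5 * 1024)          # 9.5 K jumbo frame = 9728 bytes

@dataclass
class RxDesc:                        # hypothetical stand-in for a write-back descriptor
    dd: bool = False                 # Descriptor Done
    eop: bool = False                # End Of Packet
    length: int = 0

class RxRing:
    def __init__(self, size):
        self.ring = [RxDesc() for _ in range(size)]
        self.next = 0                # next descriptor the software will examine
        self.partial = []            # buffer lengths of a frame not yet complete

    def poll(self):
        """Return the lengths of frames completed during this poll."""
        frames = []
        while True:
            desc = self.ring[self.next]
            if not desc.dd:          # not written back yet: stop, keep partial state
                break
            self.partial.append(desc.length)
            if desc.eop:             # last buffer of the frame: hand it up
                frames.append(sum(self.partial))
                self.partial = []
            self.ring[self.next] = RxDesc()              # recycle the slot
            self.next = (self.next + 1) % len(self.ring)
        return frames

descs_needed = math.ceil(FRAME_LEN / BUF_SIZE)

ring = RxRing(8)
# Simulate the hardware having written back all but the last descriptor.
for i in range(descs_needed - 1):
    ring.ring[i] = RxDesc(dd=True, length=BUF_SIZE)
first_poll = ring.poll()

# The final descriptor shows up later, carrying the remainder and EOP.
last_len = FRAME_LEN - BUF_SIZE * (descs_needed - 1)
ring.ring[descs_needed - 1] = RxDesc(dd=True, eop=True, length=last_len)
second_poll = ring.poll()

print(descs_needed, first_poll, second_poll)
```

In this model the first poll consumes four descriptors and returns nothing, and the second poll returns the complete frame. Whether the 82599 can really expose such a partial write-back is the open question above; the sketch only shows that software need not treat it as an error.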


Re: make buildworld is now 50% slower

2013-07-07 Thread Jeremy Chadwick
On Sun, Jul 07, 2013 at 11:50:29AM +0300, Daniel Braniss wrote:
> > On Fri, Jul 05, 2013 at 02:39:00PM +0200, Dimitry Andric wrote:
> > > [redirecting to the correct mailing list, freebsd-stable@ ...]
> > > 
> > > On Jul 5, 2013, at 10:53, Daniel Braniss  wrote:
> > > > > after today's update of 9.1-STABLE I noticed that make
> > > > > build[world|kernel] are taking considerably more time, is it
> > > > > because of the upgrade of clang?
> > > > > and if so, is the code produced any better?
> > > > > 
> > > > > before:
> > > > > buildworld:   26m4.52s real 2h28m32.12s user 36m6.27s sys
> > > > > buildkernel:  7m29.42s real   23m22.22s user 4m26.26s sys
> > > > > 
> > > > > today:
> > > > > buildworld:  34m29.80s real  2h38m9.37s user 37m7.61s sys
> > > > > buildkernel: 15m31.52s real   22m59.40s user 4m33.06s sys
> > > > 
> > > > Ehm, your user and sys times are not that much different at all, they
> > > > add up to about 5% slower for buildworld, and 1% faster for buildkernel.
> > > Are you sure nothing else is running on that machine, eating up CPU time
> > > while you are building? :)
> > > 
> > > But yes, clang 3.3 is of course somewhat larger than 3.2.  You might
> > > especially notice that, if you are using gcc, which is very slow at
> > > compiling C++.
> > > 
> > > In any case, if you do not care about clang, just set WITHOUT_CLANG= in
> > > your /etc/src.conf, and you can shave off some build time.
> > 
> > I just built world/kernel (stable/9 r252769) 5 hours ago.  Results:
> > 
> > time make -j4 buildworld  = roughly 21 minutes on my hardware
> > time make -j4 buildkernel = roughly 8 minutes on my hardware
> > 
> 
> It's been a long time since I saw such numbers, maybe it's time
> to see where time is being spent, I will run it without clang to compare with
> your numbers.
> 
> > These numbers are about the norm for me, meaning I do not see a
> > substantial increase in build times.
> > 
> > Key point: I do not use/build/grok clang, i.e. WITHOUT_CLANG=true is in
> > my src.conf.  But I am aware of the big clang change in r252723.
> > 
> > If hardware details are wanted, ask, but I don't think it's relevant to
> > what the root cause is.
> > 
> 
> from what you are saying, I guess clang is not responsible.
> looking for my Sherlock Holmes hat.

Some points to those numbers I stated above:

- System is an Intel Q9550 with 8GB of RAM

- Single SSD (UFS2+SU+TRIM) is used for root, /usr, /var, /tmp, and swap

- /usr/src is on ZFS (raidz1 + 3 disks) -- however I got equally small
numbers when it was on the SSD

- /usr/src is using compression=lz4  (to folks from -fs: yeah, I'm
trying it out to see how much of an impact it has on interactivity.  I
can still tell when it kicks in, but it's way, way better than lzjb.
Rather not get into that here)

- Contents of /etc/src.conf (to give you some idea of what I disable):

WITHOUT_ATM=true
WITHOUT_BLUETOOTH=true
WITHOUT_CLANG=true
WITHOUT_FLOPPY=true
WITHOUT_FREEBSD_UPDATE=true
WITHOUT_INET6=true
WITHOUT_IPFILTER=true
WITHOUT_IPX=true
WITHOUT_KERBEROS=true
WITHOUT_LIB32=true
WITHOUT_LPR=true
WITHOUT_NDIS=true
WITHOUT_NETGRAPH=true
WITHOUT_PAM_SUPPORT=true
WITHOUT_PPP=true
WITHOUT_SENDMAIL=true
WITHOUT_WIRELESS=true
WITH_OPENSSH_NONE_CIPHER=true

It's WITHOUT_CLANG that cuts down the buildworld time by a *huge* amount
(I remember when it got introduced, my buildworld jumped up to something
like 40 minutes); the rest probably save a minute or two at most.

- /etc/make.conf doesn't contain much that's relevant, other than:

CPUTYPE?=core2

# For DTrace; also affects ports
STRIP=
CFLAGS+=-fno-omit-frame-pointer

- I do some tweaks in /etc/sysctl.conf (mainly vfs.read_min and
vfs.read_max), but I will admit I am not completely sure what those
do quite yet (I just saw the commit from scottl@ a while back talking
about how an increased vfs.read_min helps them at Netflix quite a
lot).  I also adjust kern.maxvnodes.

- Some ZFS ARC settings are adjusted in /boot/loader.conf (I'm playing
with some stuff I read in Andriy Gapon's ZFS PDF), but they definitely
do not have a major impact on the numbers I listed off.

- I do increase kern.maxdsiz, kern.dfldsiz, and kern.maxssiz in
/boot/loader.conf to 2560M/2560M/256M respectively, but that was mainly
from the days when I ran MySQL and needed huge userland processes.

All in all my numbers are low/small because of two things: the SSD, and
WITHOUT_CLANG.

Hope this gives you somewhere to start/stuff to ponder.

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |



Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Jeremy Chadwick
On Sun, Jul 07, 2013 at 03:51:12PM +1000, Ian Smith wrote:
> On Sun, 30 Jun 2013 15:02:57 -0700, Adrian Chadd wrote:
>  > On 30 June 2013 07:22, Ian Smith  wrote:
> [..]
>  > > Nothing of note that I can see, if that usb hub-to-bus remapping is
>  > > normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
>  > > Maybe someone who knows might comment on that?
> 
> Does no one know what that signifies?  Maybe it's not relevant to this.

It's too vague to know.  The error comes from lapic_handle_error(),
which is a generic/small routine which pulls the local APIC error status
register.  (Note I'm saying APIC, not ACPI -- two different things)

apic_vector.S sets this up/makes use of this function, and it's done as
an interrupt handler.

I think this is one of those situations where you have to know *what* is
being set up/done at that moment in time for the error code to mean
something.  Maybe booting verbose would give more information as to what
was being done that led up to the line.

I've CC'd John Baldwin who might have some ideas.
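
For what it's worth, if the value in that message is the raw contents of the local APIC Error Status Register, the bits can at least be decoded mechanically. A minimal Python sketch; the bit names follow the Intel SDM description of the ESR and should be verified against the manual, and the function name is made up for illustration:

```python
# Local APIC Error Status Register bits (names per the Intel SDM; verify there).
ESR_BITS = {
    0: "Send Checksum Error",
    1: "Receive Checksum Error",
    2: "Send Accept Error",
    3: "Receive Accept Error",
    4: "Redirectable IPI",
    5: "Send Illegal Vector",
    6: "Received Illegal Vector",
    7: "Illegal Register Address",
}

def decode_esr(value):
    """Return the names of all error bits set in an ESR value."""
    return [name for bit, name in ESR_BITS.items() if value & (1 << bit)]

print(decode_esr(0x40))
```

Under that assumption 0x40 is bit 6 alone, meaning the local APIC received an interrupt message with a vector it considers illegal. What sent that message still has to be worked out from what was being set up at the time.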

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |



Re: make buildworld is now 50% slower

2013-07-07 Thread Matthew D. Fuller
Apropos of nothing, but...

On Sun, Jul 07, 2013 at 03:17:14AM -0700 I heard the voice of
Jeremy Chadwick, and lo! it spake thus:
>
> WITHOUT_LIB32=true

suggests you're running amd64, which I'm pretty sure means

> - I do increase kern.maxdsiz, kern.dfldsiz, and kern.maxssiz in
> /boot/loader.conf to 2560M/2560M/256M respectively, but that was mainly
> from the days when I ran MySQL and needed huge userland processes.

are not necessarily _in_creases, and may well be mostly _de_creases.
e.g., on a RELENG_9 box with 8 gig of physical RAM:

% sysctl kern.{max{d,s},dfld}siz
kern.maxdsiz: 34359738368
kern.maxssiz: 536870912
kern.dfldsiz: 134217728

while a -CURRENT box with 16 has dfldsiz blown all the way up too.  I
don't recall doing anything to change them at all recently, and a
glance over loader.conf, sysctl.conf, rc.local, and the kernel configs
doesn't turn up anything.
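
To make the comparison concrete, here is a small Python sketch that converts the loader.conf-style sizes from the quoted text and compares them with the sysctl output above. The helper name is hypothetical; the numbers are the ones already shown in this message.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def to_bytes(text):
    """Convert a size such as '2560M' to bytes; plain numbers pass through."""
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)

# Defaults shown by sysctl on the RELENG_9 box above.
defaults = {
    "kern.maxdsiz": 34359738368,
    "kern.maxssiz": 536870912,
    "kern.dfldsiz": 134217728,
}

# Values from the quoted loader.conf description (2560M/2560M/256M).
tuned = {
    "kern.maxdsiz": to_bytes("2560M"),
    "kern.dfldsiz": to_bytes("2560M"),
    "kern.maxssiz": to_bytes("256M"),
}

MIB = 1024 ** 2
verdict = {}
for name, default in defaults.items():
    verdict[name] = "decrease" if tuned[name] < default else "increase or equal"
    print(f"{name}: default {default // MIB}M, tuned {tuned[name] // MIB}M "
          f"-> {verdict[name]}")
```

With these inputs, kern.maxdsiz and kern.maxssiz come out as decreases relative to the defaults, and only kern.dfldsiz is an increase.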


-- 
Matthew Fuller (MF4839)   |  fulle...@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
   On the Internet, nobody can hear you scream.


Re: make buildworld is now 50% slower

2013-07-07 Thread Jeremy Chadwick
On Sun, Jul 07, 2013 at 05:47:31AM -0500, Matthew D. Fuller wrote:
> Apropos of nothing, but...
> 
> On Sun, Jul 07, 2013 at 03:17:14AM -0700 I heard the voice of
> Jeremy Chadwick, and lo! it spake thus:
> >
> > WITHOUT_LIB32=true
> 
> suggests you're running amd64, which I'm pretty sure means
> 
> > - I do increase kern.maxdsiz, kern.dfldsiz, and kern.maxssiz in
> > /boot/loader.conf to 2560M/2560M/256M respectively, but that was mainly
> > from the days when I ran MySQL and needed huge userland processes.
> 
> are not necessarily _in_creases, and may well be mostly _de_creases.
> e.g., on a RELENG_9 box with 8 gig of physical RAM:
> 
> % sysctl kern.{max{d,s},dfld}siz
> kern.maxdsiz: 34359738368
> kern.maxssiz: 536870912
> kern.dfldsiz: 134217728
>
> while a -CURRENT box with 16 has dfldsiz blown all the way up too.  I
> don't recall doing anything to change them at all recently, and a
> glance over loader.conf, sysctl.conf, rc.local, and the kernel configs
> doesn't turn up anything.

Thanks!

The settings I mention are from "ancient times" -- specifically RELENG_6
on i386 (I know because I found an old mailing list post of mine
discussing the settings with a user).

The problem as I said was that mysqld would crap itself (crash and be
quite loud about it) if the process allocated too much memory/became too
large.  I am fairly certain the issue related to the data size, **not**
the stack size (but I didn't see the harm in increasing that either).

It's good to know I can remove these on amd64.  Yay, one less thing in
loader.conf I have to deal with...  :-)  Thanks again!

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |



Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-07-07 Thread Andre Albsmeier
On Sun, 07-Jul-2013 at 09:41:12 +0200, Konstantin Belousov wrote:
> On Sun, Jul 07, 2013 at 09:25:53AM +0200, Andre Albsmeier wrote:
> > OK, here we go (looks better now):
> > 
> > GNU gdb 6.1.1 [FreeBSD]
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain 
> > conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > This GDB was configured as "i386-marcel-freebsd"...
> > 
> > Unread portion of the kernel message buffer:
> > dev = stripe/p, block = 592, fs = /palveli
> > panic: ffs_blkfree_cg: freeing free block
> > KDB: stack backtrace:
> > db_trace_self_wrapper(c08207eb,d70fc924,c05fdfc9,c081df13,c08a82e0,...) at 
> > db_trace_self_wrapper+0x26/frame 0xd70fc8f4
> > kdb_backtrace(c081df13,c08a82e0,c0833a0b,d70fc930,d70fc930,...) at 
> > kdb_backtrace+0x29/frame 0xd70fc900
> > panic(c0833a0b,c2aae178,250,0,c2af80d4,...) at panic+0xc9/frame 0xd70fc924
> > ffs_blkfree_cg(250,0,8000,49f,d70fcad0,...) at ffs_blkfree_cg+0x399/frame 
> > 0xd70fc9c8
> > ffs_blkfree(c2b35100,c2af8000,c2b0d470,250,0,...) at ffs_blkfree+0xad/frame 
> > 0xd70fca00
> > indir_trunc(fffa3ff4,,0,8000,0,...) at indir_trunc+0x658/frame 
> > 0xd70fcae0
> > indir_trunc(dff3,,c072df0a,c2d68d00,c087abd8,...) at 
> > indir_trunc+0x514/frame 0xd70fcbc0
> > handle_workitem_freeblocks(0,d70fcc4c,2,246,c2ab1000,...) at 
> > handle_workitem_freeblocks+0x2dc/frame 0xd70fcc24
> > process_worklist_item(0,0,0,c086ae78,0,...) at 
> > process_worklist_item+0x27a/frame 0xd70fcc6c
> > softdep_process_worklist(c2b36548,0,54,c0835825,64,...) at 
> > softdep_process_worklist+0x91/frame 0xd70fcc9c
> > softdep_flush(0,d70fcd08,0,c2aac2f0,0,...) at softdep_flush+0x3e4/frame 
> > 0xd70f
> > fork_exit(c0738bb0,0,d70fcd08) at fork_exit+0xa2/frame 0xd70fccf4
> > fork_trampoline() at fork_trampoline+0x8/frame 0xd70fccf4
> > --- trap 0, eip = 0, esp = 0xd70fcd40, ebp = 0 ---
> > Uptime: 2d16h29m37s
> > Physical memory: 503 MB
> > Dumping 95 MB: 80 64 48 32 16
> > 
> > No symbol "stopped_cpus" in current context.
> > No symbol "stoppcbs" in current context.
> > #0  doadump (textdump=1) at pcpu.h:249
> > 249 pcpu.h: No such file or directory.
> > in pcpu.h
> > (kgdb) where
> > #0  doadump (textdump=1) at pcpu.h:249
> > #1  0xc05f in kern_reboot (howto=260) at 
> > /src/src-9/sys/kern/kern_shutdown.c:449
> > #2  0xc05fe028 in panic (fmt=) at 
> > /src/src-9/sys/kern/kern_shutdown.c:637
> > #3  0xc0717899 in ffs_blkfree_cg (ump=0xc2b35100, fs=0xc2af8000, 
> > devvp=0xc2b0d470, bno=592, 
> > size=32768, inum=1183, dephd=0xd70fcad0) at 
> > /src/src-9/sys/ufs/ffs/ffs_alloc.c:2151
> > #4  0xc0717c8d in ffs_blkfree (ump=0xc2b35100, fs=0xc2af8000, 
> > devvp=0xc2b0d470, bno=592, 
> > size=32768, inum=1183, vtype=VREG, dephd=0xd70fcad0) at 
> > /src/src-9/sys/ufs/ffs/ffs_alloc.c:2280
> > #5  0xc0730348 in indir_trunc (freework=0xc2f99100, dbn=1642816, 
> > lbn=-376844)
> > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7965
> > #6  0xc0730204 in indir_trunc (freework=0xc2f99100, dbn=1639680, lbn=-8205)
> > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7946
> > #7  0xc07324bc in handle_workitem_freeblocks (freeblks=0xc2fc1e00, 
> > flags=512)
> > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7588
> > #8  0xc0730dfa in process_worklist_item (mp=0xc2b36548, target=10, 
> > flags=512)
> > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1774
> > #9  0xc07360c1 in softdep_process_worklist (mp=0xc2b36548, full=0)
> > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1558
> > #10 0xc0738f94 in softdep_flush () at 
> > /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> > #11 0xc05d1b82 in fork_exit (callout=0xc0738bb0 , arg=0x0, 
> > frame=0xd70fcd08)
> > at /src/src-9/sys/kern/kern_fork.c:988
> > #12 0xc07ba904 in fork_trampoline () at 
> > /src/src-9/sys/i386/i386/exception.s:279
> > (kgdb) up 10
> > #10 0xc0738f94 in softdep_flush () at 
> > /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> > 1414                    progress += softdep_process_worklist(mp, 0);
> > 
> > -Andre
> 
> This looks unrelated, and exactly this panic usually has one of two
> causes:
> - corrupted filesystem, run fsck to recheck it;

root@palveli:~>fsck /dev/stripe/p 
** /dev/stripe/p
** Last Mounted on /palveli
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
9895 files, 2039706 used, 15697693 free (5397 frags, 1961537 blocks, 0.0% 
fragmentation)

* FILE SYSTEM IS CLEAN *
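
As a cross-check, the summary line above is arithmetically self-consistent. The following is a minimal sketch, assuming 8 fragments per block (32768-byte blocks, as suggested by size=32768 in the backtrace, and 4096-byte fragments, which is an assumption) and assuming the fragmentation figure is free fragments relative to all data fragments. The function names are made up for illustration.

```python
# Numbers copied from the fsck summary line.
USED_FRAGS = 2039706
FREE_TOTAL = 15697693
FREE_FRAGS = 5397
FREE_BLOCKS = 1961537

# Assumption: 32768-byte blocks and 4096-byte fragments, i.e. 8 frags per block.
FRAGS_PER_BLOCK = 8


def free_in_frags(frags, blocks, frags_per_block=FRAGS_PER_BLOCK):
    """Free space expressed in fragments: loose frags plus whole free blocks."""
    return frags + blocks * frags_per_block


def fragmentation_pct(frags, total_frags):
    """Loose free fragments as a percentage of all data fragments (assumed formula)."""
    return 100.0 * frags / total_frags


if __name__ == "__main__":
    print(free_in_frags(FREE_FRAGS, FREE_BLOCKS) == FREE_TOTAL)
    print("%.1f%%" % fragmentation_pct(FREE_FRAGS, USED_FRAGS + FREE_TOTAL))
```

With these assumptions the free count matches exactly and the fragmentation rounds to the 0.0% shown.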

> - faulty hardware, most likely RAM, but might be CPU/CPU cache/bus.

Well, of course I cannot prove that this is not the case.
But the box runs flawlessly otherwise. RAM is ECC monitored,
PSU is OK and airflow is OK. Sure, I can't look inside of
CPU et

Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-07-07 Thread Jeremy Chadwick
On Sun, Jul 07, 2013 at 02:13:54PM +0200, Andre Albsmeier wrote:
> On Sun, 07-Jul-2013 at 09:41:12 +0200, Konstantin Belousov wrote:
> > On Sun, Jul 07, 2013 at 09:25:53AM +0200, Andre Albsmeier wrote:
> > > OK, here we go (looks better now):
> > > 
> > > GNU gdb 6.1.1 [FreeBSD]
> > > Copyright 2004 Free Software Foundation, Inc.
> > > GDB is free software, covered by the GNU General Public License, and you 
> > > are
> > > welcome to change it and/or distribute copies of it under certain 
> > > conditions.
> > > Type "show copying" to see the conditions.
> > > There is absolutely no warranty for GDB.  Type "show warranty" for 
> > > details.
> > > This GDB was configured as "i386-marcel-freebsd"...
> > > 
> > > Unread portion of the kernel message buffer:
> > > dev = stripe/p, block = 592, fs = /palveli
> > > panic: ffs_blkfree_cg: freeing free block
> > > KDB: stack backtrace:
> > > db_trace_self_wrapper(c08207eb,d70fc924,c05fdfc9,c081df13,c08a82e0,...) 
> > > at db_trace_self_wrapper+0x26/frame 0xd70fc8f4
> > > kdb_backtrace(c081df13,c08a82e0,c0833a0b,d70fc930,d70fc930,...) at 
> > > kdb_backtrace+0x29/frame 0xd70fc900
> > > panic(c0833a0b,c2aae178,250,0,c2af80d4,...) at panic+0xc9/frame 0xd70fc924
> > > ffs_blkfree_cg(250,0,8000,49f,d70fcad0,...) at ffs_blkfree_cg+0x399/frame 
> > > 0xd70fc9c8
> > > ffs_blkfree(c2b35100,c2af8000,c2b0d470,250,0,...) at 
> > > ffs_blkfree+0xad/frame 0xd70fca00
> > > indir_trunc(fffa3ff4,,0,8000,0,...) at indir_trunc+0x658/frame 
> > > 0xd70fcae0
> > > indir_trunc(dff3,,c072df0a,c2d68d00,c087abd8,...) at 
> > > indir_trunc+0x514/frame 0xd70fcbc0
> > > handle_workitem_freeblocks(0,d70fcc4c,2,246,c2ab1000,...) at 
> > > handle_workitem_freeblocks+0x2dc/frame 0xd70fcc24
> > > process_worklist_item(0,0,0,c086ae78,0,...) at 
> > > process_worklist_item+0x27a/frame 0xd70fcc6c
> > > softdep_process_worklist(c2b36548,0,54,c0835825,64,...) at 
> > > softdep_process_worklist+0x91/frame 0xd70fcc9c
> > > softdep_flush(0,d70fcd08,0,c2aac2f0,0,...) at softdep_flush+0x3e4/frame 
> > > 0xd70f
> > > fork_exit(c0738bb0,0,d70fcd08) at fork_exit+0xa2/frame 0xd70fccf4
> > > fork_trampoline() at fork_trampoline+0x8/frame 0xd70fccf4
> > > --- trap 0, eip = 0, esp = 0xd70fcd40, ebp = 0 ---
> > > Uptime: 2d16h29m37s
> > > Physical memory: 503 MB
> > > Dumping 95 MB: 80 64 48 32 16
> > > 
> > > No symbol "stopped_cpus" in current context.
> > > No symbol "stoppcbs" in current context.
> > > #0  doadump (textdump=1) at pcpu.h:249
> > > 249 pcpu.h: No such file or directory.
> > > in pcpu.h
> > > (kgdb) where
> > > #0  doadump (textdump=1) at pcpu.h:249
> > > #1  0xc05f in kern_reboot (howto=260) at 
> > > /src/src-9/sys/kern/kern_shutdown.c:449
> > > #2  0xc05fe028 in panic (fmt=) at 
> > > /src/src-9/sys/kern/kern_shutdown.c:637
> > > #3  0xc0717899 in ffs_blkfree_cg (ump=0xc2b35100, fs=0xc2af8000, 
> > > devvp=0xc2b0d470, bno=592, 
> > > size=32768, inum=1183, dephd=0xd70fcad0) at 
> > > /src/src-9/sys/ufs/ffs/ffs_alloc.c:2151
> > > #4  0xc0717c8d in ffs_blkfree (ump=0xc2b35100, fs=0xc2af8000, 
> > > devvp=0xc2b0d470, bno=592, 
> > > size=32768, inum=1183, vtype=VREG, dephd=0xd70fcad0) at 
> > > /src/src-9/sys/ufs/ffs/ffs_alloc.c:2280
> > > #5  0xc0730348 in indir_trunc (freework=0xc2f99100, dbn=1642816, 
> > > lbn=-376844)
> > > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7965
> > > #6  0xc0730204 in indir_trunc (freework=0xc2f99100, dbn=1639680, 
> > > lbn=-8205)
> > > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7946
> > > #7  0xc07324bc in handle_workitem_freeblocks (freeblks=0xc2fc1e00, 
> > > flags=512)
> > > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7588
> > > #8  0xc0730dfa in process_worklist_item (mp=0xc2b36548, target=10, 
> > > flags=512)
> > > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1774
> > > #9  0xc07360c1 in softdep_process_worklist (mp=0xc2b36548, full=0)
> > > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1558
> > > #10 0xc0738f94 in softdep_flush () at 
> > > /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> > > #11 0xc05d1b82 in fork_exit (callout=0xc0738bb0 , arg=0x0, 
> > > frame=0xd70fcd08)
> > > at /src/src-9/sys/kern/kern_fork.c:988
> > > #12 0xc07ba904 in fork_trampoline () at 
> > > /src/src-9/sys/i386/i386/exception.s:279
> > > (kgdb) up 10
> > > #10 0xc0738f94 in softdep_flush () at 
> > > /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> > > 1414                    progress += softdep_process_worklist(mp, 0);
> > > 
> > >   -Andre
> > 
> > This looks unrelated, and exactly this panic usually has one of two
> > causes:
> > - corrupted filesystem, run fsck to recheck it;
> 
> root@palveli:~>fsck /dev/stripe/p 
> ** /dev/stripe/p
> ** Last Mounted on /palveli
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> ** Phase 5 - Check Cyl groups
> 9895 files, 2039706 used, 15697693 free (5397 frags, 1961537 blocks, 0.0% 
> fra

Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Ian Smith
On Sun, 7 Jul 2013 03:26:24 -0700, Jeremy Chadwick wrote:
 > On Sun, Jul 07, 2013 at 03:51:12PM +1000, Ian Smith wrote:
 > > On Sun, 30 Jun 2013 15:02:57 -0700, Adrian Chadd wrote:
 > >  > On 30 June 2013 07:22, Ian Smith  wrote:
 > > [..]
 > >  > > Nothing of note that I can see, if that usb hub-to-bus remapping is
 > >  > > normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
 > >  > > Maybe someone who knows might comment on that?
 > > 
 > > Does no one know what that signifies?  Maybe it's not relevant to this.
 > 
 > It's too vague to know.  The error comes from lapic_handle_error(),
 > which is a generic/small routine which pulls the local APIC error status
 > register.  (Note I'm saying APIC, not ACPI -- two different things)

Indeed; I've been familiar with PICs since c.'79.  Googling to check 
what the 'A' stood for I found this .. from '97 but usefully descriptive 
perhaps: http://people.freebsd.org/~fsmp/SMP/papers/apicsubsystem.txt

I also found this from March 2011 involving Mike Tancsa, you and jhb@ :)
http://freebsd.1045724.n5.nabble.com/CPU0-local-APIC-error-0x40-CPU1-local-APIC-error-0x40-td3961805.html

 > apic_vector.S sets this up/makes use of this function, and it's done as
 > an interrupt handler.

Whether an (unserviced?) interrupt error is related to Adrian's symptom 
- apparent total failure of USB reinitialisation on resume, but only if 
no USB devices exist in the external slots - remains to be seen.  hps@ 
has just confirmed that it should work the same as on boot, but then 
this error was flagged on boot - perhaps it also manifests on resume?
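
For reference, a minimal sketch of decoding the reported value. It assumes the local APIC Error Status Register bit layout as documented in the Intel SDM; the table is transcribed from that manual as an assumption, it is not taken from this thread or from lapic_handle_error(), and the names ESR_BITS and decode_esr are made up for illustration.

```python
# Local APIC Error Status Register (ESR) bits.
# Assumption: layout as given in the Intel SDM; some bits are reserved
# on some processor families.
ESR_BITS = {
    0: "Send Checksum Error",
    1: "Receive Checksum Error",
    2: "Send Accept Error",
    3: "Receive Accept Error",
    4: "Redirectable IPI",
    5: "Send Illegal Vector",
    6: "Received Illegal Vector",
    7: "Illegal Register Address",
}


def decode_esr(value):
    """Return the names of all error bits set in an ESR value, lowest bit first."""
    return [name for bit, name in sorted(ESR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # 0x40 is the value from 'CPU0: local APIC error 0x40'.
    print(decode_esr(0x40))
```

Under that layout 0x40 is bit 6 alone. Whether that has anything to do with the USB symptom is a separate question.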

 > I think this is one of those situations where you have to know *what* is
 > being set up/done at that moment in time for the error code to mean
 > something.  Maybe booting verbose would give more information as to what
 > was being done that led up to the line.
 > 
 > I've CC'd John Baldwin who might have some ideas.

Thanks.  We have verbose dmesg already.  Thread starts (in -stable) at
http://lists.freebsd.org/pipermail/freebsd-stable/2013-June/073917.html
and amidst some wild goose chases, pointer to verbose dmesg etc is at
http://lists.freebsd.org/pipermail/freebsd-stable/2013-June/074018.html

cheers, Ian
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Adrian Chadd
I don't think it's a USB controller issue.

Those ports are connected to USB hubs, right? I wonder if there's some
ACPI nonsense that's resulting in the hubs not being powered up on
resume.



-adrian

On 7 July 2013 00:32, Hans Petter Selasky
 wrote:
> Hi,
>
> FYI: The USB stack will currently run a complete controller reset upon
> resume, like during boot.
>
> --HPS
>
>
>
> -Original message-
>> From:Ian Smith 
>> Sent: Sunday 7th July 2013 7:52
>> To: Adrian Chadd 
>> Cc: freebsd-a...@freebsd.org; freebsd-stable@freebsd.org;
>> freebsd-...@freebsd.org
>> Subject: Re: USB ports on Lenovo T400 do not work after a suspend/resume
>>
>> On Sun, 30 Jun 2013 15:02:57 -0700, Adrian Chadd wrote:
>>  > On 30 June 2013 07:22, Ian Smith  wrote:
>> [..]
>>  > > Nothing of note that I can see, if that usb hub-to-bus remapping is
>>  > > normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
>>  > > Maybe someone who knows might comment on that?
>>
>> Does no one know what that signifies?  Maybe it's not relevant to this.
>>
>>  > > Just checking: you've tried other USB devices apart from uftdi0?
>>  >
>>  > Yup, there's no 5v on the port.
>>
>> I was rather taken aback to hear this.  Would not this indicate a
>> failure to reinitialise the basic underlying USB hardware on resume?
>>
>> More than a bit bemused, Ian
>> ___
>> freebsd-a...@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-acpi
>> To unsubscribe, send any mail to "freebsd-acpi-unsubscr...@freebsd.org"
>>


Re: ixgbe Jumbo race condition leading to Deadlock

2013-07-07 Thread Jack Vogel
The "potential race condition", as the data sheet puts it, arises only when
you try to manage your RX ring by reading the RDH register. That is a bad
idea anyway; none of our (Intel) drivers do this. Using the DD bit is what
you want to do. The DD bit is set when the descriptor is written back, and
that happens when the DMA is complete.

The packet is incomplete until you reach the descriptor with EOP set. In my
code an mbuf chain is created, and as each new descriptor is processed the
pointer to the head of the whole chain is kept in rxbuf->fmp; thus when you
get to the EOP descriptor you will be ready to send the whole chain to the
stack.
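
The DD/EOP handling described above can be sketched roughly as follows. This is a simplified model and not the ixgbe driver code: the RxDesc and RxRing classes are hypothetical, a Python list stands in for the mbuf chain, and the fmp attribute only mirrors the role of rxbuf->fmp.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RxDesc:
    dd: bool = False    # Descriptor Done: hardware has written this descriptor back
    eop: bool = False   # End Of Packet: last descriptor of a frame
    data: bytes = b""


@dataclass
class RxRing:
    descs: List[RxDesc]
    next_to_check: int = 0
    fmp: Optional[List[bytes]] = None   # partially assembled chain, kept across polls

    def poll(self):
        """Consume every written-back descriptor; return the completed packets."""
        completed = []
        while True:
            desc = self.descs[self.next_to_check]
            if not desc.dd:
                break                       # nothing more written back yet
            if self.fmp is None:
                self.fmp = []               # first buffer of a new packet
            self.fmp.append(desc.data)
            if desc.eop:
                completed.append(b"".join(self.fmp))
                self.fmp = None             # chain handed off to the stack
            # Recycle the slot (a real driver would refresh the buffer here).
            self.descs[self.next_to_check] = RxDesc()
            self.next_to_check = (self.next_to_check + 1) % len(self.descs)
        return completed
```

The point of the sketch is that descriptors with DD set can be consumed as they arrive, with the partial chain surviving between polls, and the packet is only passed up once EOP is seen.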

It's good that you are using ONEBUF since packet split has hardware issues
on 82599.

Are you developing a new driver, or simply having issues using mine?

Regards,

Jack



On Sun, Jul 7, 2013 at 2:24 AM, Kaushal Bhandankar wrote:

> In 82599, for a Jumbo packet of 9.5 K (which consumes 5 descriptors of
> 2048 bytes each), when does the descriptor write-back happen? Does it
> happen per descriptor or once per aggregated set of descriptors? Is it
> possible for all descriptors except the last one to be written back, so
> that when I read the RDH register I get the last pending descriptor still
> waiting inside the 82599?
> We are using srrctl |= IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
>
> In my setup, I do not see EOP set even after reading 5 descriptors.
> Checking DD will return me an incomplete packet. What should I do in
> such a case?
>
> References from Data sheet:
> -> Checking through DD bits eliminates a potential race condition: all
> descriptor data is posted internally prior to incrementing the head
> register and a read of the head register could potentially pass the
> descriptor waiting inside the 82599.
>
> Regards,
> Kaushal


Re: ixgbe Jumbo race condition leading to Deadlock

2013-07-07 Thread Kaushal Bhandankar
Hi Jack,
Thanks for the explanation. Do you suggest that I keep reading RX
descriptors with the DD bit set and hold them pending until I get the
descriptor with EOP set? What is the maximum delay to expect before the EOP
descriptor is written back?

Regards,
Kaushal




Re: ixgbe Jumbo race condition leading to Deadlock

2013-07-07 Thread Kaushal Bhandankar
The solution is:

I have a function which pre-calculates the buffers required for processing
the packet. This is to eliminate any lack-of-memory errors during
processing.
In this function I loop over the descriptors from next_to_check onwards. If
I loop over descriptors with DD set and do not see EOP set by the time I
reach 5 descriptors, I return a failure and do not change next_to_check
etc.
That way, on the next read or a later one, the descriptor with EOP will
hopefully have been written back. This ensures that the race condition does
not happen.
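
The pre-check described above could look roughly like this. It is a sketch only: RxDesc, descs_for_next_packet and the limit of 5 (a 9.5 K frame in 2048-byte buffers) are illustrative names and values, not the actual driver code.

```python
from collections import namedtuple

# Hypothetical, simplified view of a receive descriptor's status bits.
RxDesc = namedtuple("RxDesc", ["dd", "eop"])

MAX_DESCS_PER_PACKET = 5   # 9.5 K jumbo frame / 2048-byte buffers, per the thread


def descs_for_next_packet(ring, next_to_check, limit=MAX_DESCS_PER_PACKET):
    """Return how many descriptors form the next complete packet.

    Returns 0 when the EOP descriptor has not been written back yet, so the
    caller can retry later. Neither the ring nor next_to_check is modified.
    """
    for i in range(limit):
        desc = ring[(next_to_check + i) % len(ring)]
        if not desc.dd:
            return 0          # hardware has not written this one back yet
        if desc.eop:
            return i + 1      # whole packet is present; safe to consume
    return 0                  # no EOP within the limit: report failure


if __name__ == "__main__":
    ring = [RxDesc(True, False)] * 4 + [RxDesc(False, False)] * 4
    print(descs_for_next_packet(ring, 0))   # EOP descriptor still pending
    ring[4] = RxDesc(True, True)
    print(descs_for_next_packet(ring, 0))   # packet now complete
```

Because the scan has no side effects, a failed check leaves the ring state untouched and the same descriptors are simply examined again on the next pass.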

Let me know if it sounds good.

Regards,
Kaushal




RE: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Hans Petter Selasky
Hi,

The USB code should re-attach the uhub driver to the root HUB and any other
HUBs after resume. Part of the attach code is to set the power on.

See /sys/dev/usb/usb_hub.c

And:

grep -r UHF_PORT_POWER /sys/dev/usb/

--HPS
 
 


Re: status of autotuning freebsd for 9.2

2013-07-07 Thread Alfred Perlstein

On 7/7/13 1:34 AM, Andre Oppermann wrote:
> On 07.07.2013 08:32, Alfred Perlstein wrote:
>> Andre,
>>
>> Are you going to have time to MFC things from -current for
>> auto-tuning -stable before 9.2?
>
> I simply ran out of time on Friday and MFCing such a big change requires
> more testing.
>
>> I fear (maybe unnecessarily?) that we are about to ship yet another
>> release that can't do basic 10gigE when sufficient memory exists.
>
> There was some debate with myself whether such a behavior changing MFC
> would be appropriate for a mid-stream stable release.  I guess yes, though
> a number of people who currently set the parameters manually would have
> to remove their tuning settings.
>
>> If you don't have time, then let me know and I'll see what I can do.
>
> Can you help me with testing?

Yes.  Please give me your proposed changes and I'll stand up a machine
and give feedback.


-Alfred


Re: XHCI umass support breaks between r248085 and r252560 on 9-STABLE

2013-07-07 Thread Alexandre Kovalenko
I do apologize for the typo below, which made my message unclear: I meant to 
say that I have reverted _/usr/src/sys/dev/usb/controller_ directory, 
specifically the following files:

root@twinhead:/usr/src/sys/dev/usb/controller # svn diff -r252560 | grep Index:
Index: xhci_pci.c
Index: ohci_pci.c
Index: xhci.c
Index: usb_controller.c
Index: xhcireg.h
root@twinhead:/usr/src/sys/dev/usb/controller # 

which (I think) are USB related and not CAM related. Please, let me know if I 
am wrong.

Side question (I have been off the lists for a while): is it now considered 
polite to top-post? It was frowned upon way back when… if it still is not, I do 
apologize, but I can see no good way to fix it at this point.


Alexandre "Sunny" Kovalenko (Олександр Коваленко)


On Jul 7, 2013, at 3:36 AM, Hans Petter Selasky 
 wrote:

> Hi,
> 
> Check for CAM/SCSI related changes. There has not been so many USB changes 
> recently. Possibly not USB related.
> 
> Thank you,
> 
> --HPS
>  
> -Original message-
> > From:Alexandre Kovalenko 
> > Sent: Thursday 4th July 2013 20:58
> > To: freebsd-...@freebsd.org
> > Cc: freebsd-stable@freebsd.org
> > Subject: XHCI umass support breaks between r248085 and r252560 on 9-STABLE
> > 
> > Three different external hard drives (Seagate, Western Digital and noname 
> > USB 3.0 enclosure) refused to be recognized as the umass devices. Reverting 
> > /usr/src/sys/dev/bsd/controller to r248085, building and loading just xhci 
> > module makes drives appear again. Below are snippets from the log in both 
> > cases:
> > 
> > Non working:
> > 
> > Jul  4 14:35:17 twinhead kernel: xhci0:  
> > mem 0xfddfe000-0xfddf irq 16 at device 0.0 on pci2
> > Jul  4 14:35:17 twinhead kernel: xhci0: 64 byte context size.
> > Jul  4 14:35:17 twinhead kernel: usbus0 on xhci0
> > Jul  4 14:35:17 twinhead kernel: usbus0: 5.0Gbps Super Speed USB v3.0
> > Jul  4 14:35:17 twinhead kernel: ugen0.1: <0x1912> at usbus0
> > Jul  4 14:35:17 twinhead kernel: uhub0: <0x1912 XHCI root HUB, class 9/0, 
> > rev 3.00/1.00, addr 1> on usbus0
> > Jul  4 14:35:17 twinhead kernel: uhub0: 8 ports with 8 removable, self 
> > powered
> > Jul  4 14:35:24 twinhead kernel: ugen0.2:  at usbus0
> > Jul  4 14:35:24 twinhead kernel: umass0:  > 3.00/0.01, addr 1> on usbus0
> > Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 
> > 12 00 00 00 24 00 
> > Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
> > request completed with an error
> > Jul  4 14:35:29 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
> > Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 
> > 12 00 00 00 24 00 
> > Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
> > request completed with an error
> > Jul  4 14:35:30 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
> > Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 
> > 12 00 00 00 24 00 
> > Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
> > request completed with an error
> > Jul  4 14:35:35 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
> > Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 
> > 12 00 00 00 24 00 
> > Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
> > request completed with an error
> > Jul  4 14:35:36 twinhead kernel: (probe0:umass-sim0:0:0:0): Retrying command
> > Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): INQUIRY. CDB: 
> > 12 00 00 00 24 00 
> > Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: CCB 
> > request completed with an error
> > Jul  4 14:35:41 twinhead kernel: (probe0:umass-sim0:0:0:0): Error 5, 
> > Retries exhausted
> > 
> > Working:
> > 
> > Jul  4 14:40:20 twinhead kernel: ugen0.2:  at usbus0 (disconnected)
> > Jul  4 14:40:20 twinhead kernel: umass0: at uhub0, port 2, addr 1 
> > (disconnected)
> > Jul  4 14:40:27 twinhead kernel: ugen0.2:  at usbus0
> > Jul  4 14:40:27 twinhead kernel: umass0:  > class 0/0, rev 3.00/0.01, addr 1> on usbus0
> > Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): REPORT LUNS. 
> > CDB: a0 00 00 00 00 00 00 00 00 10 00 00 
> > Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): CAM status: 
> > SCSI Status Error
> > Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): SCSI status: 
> > Check Condition
> > Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): SCSI sense: 
> > ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
> > Jul  4 14:40:27 twinhead kernel: (probe0:umass-sim0:0:0:0): Error 22, 
> > Unretryable error
> > Jul  4 14:40:27 twinhead kernel: da0 at umass-sim0 bus 0 scbus4 target 0 
> > lun 0
> > Jul  4 14:40:27 twinhead kernel: da0:  Fixed 
> > Direct Access SCSI-5 device 
> > Jul  4 14:40:27 twinhead kernel: da0: 400.000MB/s transfers
> > Jul  4 14:40:27 twinhead kernel: da0: 190782MB (390721968 512 by

Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Lars Engels
On Sun, Jun 30, 2013 at 03:02:57PM -0700, Adrian Chadd wrote:
> On 30 June 2013 07:22, Ian Smith  wrote:
> 
> > After removing [numbers] (for WITNESS?), diff started making sense.
> > The below is between the first and second suspend/resume cycles in
> > dmesg-3.txt, encompassing the others.
> 
> Cool!
> 
> > Nothing of note that I can see, if that usb hub-to-bus remapping is
> > normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
> > Maybe someone who knows might comment on that?
> >
> > Just checking: you've tried other USB devices apart from uftdi0?
> 
> Yup, there's no 5v on the port.

Oh, BTW: can you check if you have power on the ports after the first
resume and no power after all next resumes until you reboot your
notebook?
That's the situation I had and maybe it can lead to something. ;)




Shutdown hangs on unmount of a gjournaled file system in 8-Stable

2013-07-07 Thread Andreas Longwitz
The problem occurs after an update of 8-stable from r248120 to r252111.
Sometimes shutdown hangs:

Waiting (max 60 seconds) for system process `vnlru' to stop...done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...0 0 done
All buffers synced.

From the kernel dump I see the deadlock occurs on unmount of a
gjournaled file system. Two processes are involved:

db> ps
pid ppid pgrp uid state wmesg  wchan  cmd
  1   0   1   0  SLs  mount dr 0xff007f7e559c [init]
 18   0   0   0  SL   suspwt   0xff007f7e5364 [g_journal switcher]

(kgdb) info threads
 158 Thread 12 (PID=1: init)  sched_switch (td=0xff000235e8e0,
     newtd=, flags=) at /usr/src/sys/kern/sched_ule.c:1932
 217 Thread 100076 (PID=18: g_journal switcher)  sched_switch
     (td=0xff0002bd6000, newtd=, flags=) at
     /usr/src/sys/kern/sched_ule.c:1932

(kgdb) thread 158
[Switching to thread 158 (Thread 12)]#0  sched_switch
  (td=0xff000235e8e0, newtd=, flags=) at
  /usr/src/sys/kern/sched_ule.c:1932
1932            cpuid = PCPU_GET(cpuid);

(kgdb) bt
#0  sched_switch (td=0xff000235e8e0, newtd=,
  flags=)
  at /usr/src/sys/kern/sched_ule.c:1932
#1  0x80407836 in mi_switch (flags=260, newtd=0x0) at
  /usr/src/sys/kern/kern_synch.c:466
#2  0x8043e0e2 in sleepq_wait (wchan=0xff007f7e559c, pri=80)
  at /usr/src/sys/kern/subr_sleepqueue.c:613
#3  0x80407fc6 in _sleep (ident=0xff007f7e559c,
  lock=0xff007f7e52f0,
  priority=,
  wmesg=0x8069f595 "mount drain", timo=0)
  at /usr/src/sys/kern/kern_synch.c:250
#4  0x8048ee42 in dounmount (mp=0xff007f7e52f0,
  flags=524288, td=)
  at /usr/src/sys/kern/vfs_mount.c:1266
#5  0x80493202 in vfs_unmountall () at
  /usr/src/sys/kern/vfs_subr.c:3321
#6  0x803fec69 in boot (howto=) at
  /usr/src/sys/kern/kern_shutdown.c:428
#7  0x803fef86 in reboot (td=,
  uap=0xff8000238bb0)
  at /usr/src/sys/kern/kern_shutdown.c:191
#8  0x805db1b4 in amd64_syscall (td=0xff000235e8e0,
  traced=0) at subr_syscall.c:114
#9  0x805c282c in Xfast_syscall () at
 /usr/src/sys/amd64/amd64/exception.S:387

(kgdb) f 5
#5  0x80493202 in vfs_unmountall () at
  /usr/src/sys/kern/vfs_subr.c:3321
3321            error = dounmount(mp, MNT_FORCE, td);

(kgdb) p mp->mnt_lockref
$1=1

(kgdb) f 4
#4  0x8048ee42 in dounmount (mp=0xff007f7e52f0,
 flags=524288, td=)
 at /usr/src/sys/kern/vfs_mount.c:1266
1266                    error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,

(kgdb) list
1261            if (flags & MNT_FORCE)
1262                    mp->mnt_kern_flag |= MNTK_UNMOUNTF;
1263            error = 0;
1264            if (mp->mnt_lockref) {
1265                    mp->mnt_kern_flag |= MNTK_DRAINING;
1266                    error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
1267                        "mount drain", 0);
1268            }
1269            MNT_IUNLOCK(mp);
1270            KASSERT(mp->mnt_lockref == 0,

(kgdb) thread 217
[Switching to thread 217 (Thread 100076)]#0  sched_switch
  (td=0xff0002bd6000,
   newtd=,
   flags=) at
   /usr/src/sys/kern/sched_ule.c:1932
1932            cpuid = PCPU_GET(cpuid);

(kgdb) bt
#0  sched_switch (td=0xff0002bd6000, newtd=,
   flags=)
   at /usr/src/sys/kern/sched_ule.c:1932
#1  0x80407836 in mi_switch (flags=260, newtd=0x0) at
   /usr/src/sys/kern/kern_synch.c:466
#2  0x8043e0e2 in sleepq_wait
   (wchan=0xff007f7e5364, pri=159)
   at /usr/src/sys/kern/subr_sleepqueue.c:613
#3  0x80407fc6 in _sleep (ident=0xff007f7e5364,
   lock=0xff007f7e52f0,
   priority=,
   wmesg=0x806a0813 "suspwt", timo=0)
   at /usr/src/sys/kern/kern_synch.c:250
#4  0x804a25f0 in vfs_write_suspend (mp=0xff007f7e52f0) at
   /usr/src/sys/kern/vfs_vnops.c:1277
#5  0x80c843bd in g_journal_switcher
   (arg=) at
   /usr/src/sys/modules/geom/geom_journal/../../../geom/journal/g_journal.c:2968
#6  0x803d326f in fork_exit (callout=0x80c838e0
   , arg=0x80c8b140,
   frame=0xff8242e68c40) at
   /usr/src/sys/kern/kern_fork.c:872
#7  0x805c2a0e in fork_trampoline () at
   /usr/src/sys/amd64/amd64/exception.S:602

(kgdb) f 4
#4  0x804a25f0 in vfs_write_suspend (mp=0

Re: XHCI umass support breaks between r248085 and r252560 on 9-STABLE

2013-07-07 Thread Scot Hetzel
On Sun, Jul 7, 2013 at 3:09 PM, Alexandre Kovalenko
 wrote:
>
> Side question (I have been off the lists for a while): is it now considered 
> polite to top-post? It was frowned upon way back when… if it still is not, I 
> do apologize, but I can see no good way to fix it at this point.
>
>

Side Answer:  I believe that it is preferred that you don't top-post on
the lists.  To get around GMail's top-post issue:

- just delete the first 2 lines in the reply
- scroll thru the quoted message deleting what isn't relevant to your reply
- inline your responses to the relevant parts of the message

-- 
DISCLAIMER:

No electrons were maimed while sending this message. Only slightly bruised.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Sanity Check on Mac Mini

2013-07-07 Thread Doug Hardie
As I previously indicated, I have tested a couple more Minis and updated the 
instructions with what I learned.  Here is the revised version:

2.12  Installing FreeBSD on an Apple Mac Mini

The Mac Mini is an attractive server platform.  It's small, runs cool, draws 
little power, and is reasonably cheap.  There are a variety of configurations 
available.  However, even the bottom of the line seems to be a powerful server.

There are a few issues with installing FreeBSD on the mini.  Mostly they derive 
from the newer hardware it uses and that it uses EFI rather than a BIOS for 
booting.  There is not a simple install that will get the unit working, but the 
additional steps required are quite simple.  The goal of these instructions is 
to get FreeBSD 9.1-Release running as a headless server on a Late 2012 Mini, 
Model No A1347.  It's probably possible to set up the mini as a workstation, but 
that would require some additional effort to test the display and mouse 
interfaces and find fixes for any issues with those.

The original intent was to have the server without system source so that it 
could be maintained using freebsd-update.  However, that will probably have to 
wait until 9.2-Release is available.  In the meantime, freebsd-update has to be 
used with care since I believe it will replace the modified bge files.


2.12.1  Preparing for the Install

2.12.1.1  Automatic Startup after Power is Restored

Generally servers need to be automatically restarted after a power failure.  
Start up the Mini in OS-X.  If this is a new unit, I go through the 
registration so that Apple has it on record for use with AppleCare.  Go to 
System Preferences and select Energy Saver.  I set Put hard disk to sleep when 
possible, Wake for network access, Allow power button to put the computer to 
sleep, and most importantly - Start up automatically after a power failure.  
Note, shutting down the computer at this time will not permit it to come back 
on when power is applied.  You have to pull the power plug.  Apparently this 
setting is a bit mislabeled.  It's more like Return the Power to the last status.

These settings work properly with Mac OS-X.  I have not found a way to set the 
startup settings while running FreeBSD yet.  These settings do carry over to 
the FreeBSD install.  However, you may need to lock the energy saver 
preferences for that to happen.

Shut down the Mini.


2.12.1.2  Preparing FreeBSD for the installation

You can select either the i386 or the amd64 distributions.  Both have been 
tested with these procedures and yield a working server.  The bottom of the 
line mini comes with 4 GB of memory installed.  The i386 distribution will only 
use 2 GB.  The remainder will not be used.  The amd64 distribution builds 
larger binary modules, but it will use all the memory.

Download the 9.1 Release distribution Memstick Image.  You will need to copy 
that to a memstick.  There are instructions in section 2.3.5 for copying the 
image to the memstick.  Obtain a display and USB keyboard and connect them to 
the mini.

With a browser go to svnweb.freebsd.org/base/head/sys/dev.  Click on the bge 
folder.  Click on the name if_bge.c.  Find Revision 245931.  Click on the 
download link and save the file.

Go back to the bge page and click on if_bgereg.h.  Find Revision 243686. Click 
on the download link and save the file.  Edit the saved if_bgereg.h file and 
add the following to the end:

#define PCIER_DEVICE_CAP                0x4
#define PCIER_DEVICE_CTL                0x8
#define PCIEM_CAP_MAX_PAYLOAD           0x0007
#define PCIEM_CTL_RELAXED_ORD_ENABLE    0x0010
#define PCIEM_CTL_NOSNOOP_ENABLE        0x0800
#define PCIER_DEVICE_STA                0xa
#define PCIEM_STA_CORRECTABLE_ERROR     0x0001
#define PCIEM_STA_NON_FATAL_ERROR       0x0002
#define PCIEM_STA_FATAL_ERROR           0x0004
#define PCIEM_STA_UNSUPPORTED_REQ       0x0008

There was a change to some of the names in if_bgereg.h after the 9.1 Release 
was created, but before the corrections to the bge driver were included.  It 
would be possible to grab the appropriate earlier version of if_bgereg.h, 
however, when rebuilding the kernel, there are other drivers that use the new 
names.  This seems to be the easiest approach.  Also, it worked.

Go back to the dev page and click on the mii folder.  Click on brgphy.c.  Find 
revision 244482.  Click on the download link and save the file.

Copy the saved files to another memstick.


2.12.2  Installing the 9.1 Release

Boot the mini using the memstick.  Hold down the Option key on the keyboard and 
power up the mini.  You will hear the hardware check beep and shortly 
thereafter the screen will show one or more boot icons.  Double click on the 
one named "Windows".  It will have a USB icon.

Continue through the normal installation procedure as detailed earlier in this 
chapter.  If you are building a FreeBSD only server, use the entire disk.  
Also, be sure to install the system source.  You will need it later.

You will ne

Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Adrian Chadd
Nope, no power after first resume if i have nothing plugged in.

Why?



-adrian

On 7 July 2013 13:49, Lars Engels  wrote:
> On Sun, Jun 30, 2013 at 03:02:57PM -0700, Adrian Chadd wrote:
>> On 30 June 2013 07:22, Ian Smith  wrote:
>>
>> > After removing [numbers] (for WITNESS?), diff started making sense.
>> > The below is between the first and second suspend/resume cycles in
>> > dmesg-3.txt, encompassing the others.
>>
>> Cool!
>>
>> > Nothing of note that I can see, if that usb hub-to-bus remapping is
>> > normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
>> > Maybe someone who knows might comment on that?
>> >
>> > Just checking: you've tried other USB devices apart from uftdi0?
>>
>> Yup, there's no 5v on the port.
>
> Oh, BTW: can you check if you have power on the ports after the first
> resume and no power after all next resumes until you reboot your
> notebook?
> That's the situation I had and maybe it can lead to something. ;)


Re: Sanity Check on Mac Mini

2013-07-07 Thread Yonghyeon PYUN
On Sun, Jul 07, 2013 at 05:56:09PM -0700, Doug Hardie wrote:
> As I previously indicated, I have tested a couple more Minis and updated the 
> instructions with what I learned.  Here is the revised version:
> 

[...]

> 2.12.3  Rebuilding the kernel to support the Ethernet Interface
> 
> Once the system has been rebooted, you will notice that ifconfig may not show 
> the ethernet interface.  There are at least two different chips being used 
> for that interface.  Some of the units work right out of the box.  Others do 
> not.  I have two units and the only visible difference is the Part No.  Part 
> No. MC815LL/A appears to be the older unit and the bge interface worked on 
> install.  Part No. MD387LL/A is newer and has the newer chips that require the 
> driver update.
> 
>  If the bge interface does not show, then the bge driver needs to be updated 
> to recognize the NIC.  Mount the second memstick with the files retrieved 
> earlier and move them into the kernel source.  I used the following commands:
> 
> cp -p brgphy.c /usr/src/sys/dev/mii
> cp -p if_bgereg.h /usr/src/sys/dev/bge
> cp -p if_bge.c /usr/src/sys/dev/bge
> 
> then rebuild the kernel.  Note the instructions here are for GENERIC, but you 
> can use KERNCONF to specify a custom kernel.
> 
> cd /usr/src
> make buildkernel
> make installkernel
> 
> Reboot the server as before.  Now ifconfig will show bge0 and it will work.  
> The mini is now running a useable version of 9.1-Release.  There are still 
> some items remaining to be resolved:  Updating the kernel with the recent 
> security patches, Disabling Bluetooth and Wireless to save power, and 
> unattended rebooting.  These issues are still being addressed.
> 

I'm not sure whether this bge(4) controller is sitting behind a
TB (Apple Thunderbolt) bridge. The Apple TB bridge has a known
performance issue and some BCM controllers have a work-around to
mitigate it. The work-around is not enabled by default, so I'm
interested in bge(4) performance numbers on your box. If you can't
get more than 920 ~ 930 Mbps (950 Mbps or higher with jumbo frames),
please let me know.
I haven't enabled the work-around yet since it will hurt other BCM
controllers when the TB bridge is absent.


Re: USB ports on Lenovo T400 do not work after a suspend/resume

2013-07-07 Thread Ian Smith
On Sun, 7 Jul 2013 18:47:03 -0700, Adrian Chadd wrote:
 > On 7 July 2013 13:49, Lars Engels  wrote:
 > > On Sun, Jun 30, 2013 at 03:02:57PM -0700, Adrian Chadd wrote:
 > >> On 30 June 2013 07:22, Ian Smith  wrote:
 > >>
 > >> > After removing [numbers] (for WITNESS?), diff started making sense.
 > >> > The below is between the first and second suspend/resume cycles in
 > >> > dmesg-3.txt, encompassing the others.
 > >>
 > >> Cool!
 > >>
 > >> > Nothing of note that I can see, if that usb hub-to-bus remapping is
 > >> > normal.  As you said, 'CPU0: local APIC error 0x40' looks maybe sus.
 > >> > Maybe someone who knows might comment on that?
 > >> >
 > >> > Just checking: you've tried other USB devices apart from uftdi0?
 > >>
 > >> Yup, there's no 5v on the port.
 > >
 > > Oh, BTW: can you check if you have power on the ports after the first
 > > resume and no power after all next resumes until you reboot your
 > > notebook?
 > > That's the situation I had and maybe it can lead to something. ;)

 > Nope, no power after first resume if i have nothing plugged in.
 > 
 > Why?

Checking one more point .. do the USB ports come up ok if you originally 
boot with nothing plugged in?  If so (or if not), does that local APIC 
error message appear the same then too?

cheers, Ian

PS OT: finally found a USB keyboard but I'd forgotten that my friend's 
machine is an SL500, not T500.  Moreover, because its keyboard+trackpad 
etc is non-working (internally disconnected), I have no way to resume it 
without the kbd (and the 9.1-R memstick) plugged in.  Even with kbd and 
memstick left in and using acpiconf -s3 it suspends ok but is hung after 
resume by dabbing power button; no screen and kbd is dead too - sorry, 
no help there.  OTOH my son just bought a refurb T430 ('doze 7, beats 8 
anyway) which I should get to play with a bit this week.


Re: Shutdown hangs on unmount of a gjournaled file system in 8-Stable

2013-07-07 Thread Konstantin Belousov
On Mon, Jul 08, 2013 at 12:26:43AM +0200, Andreas Longwitz wrote:
> The deadlock can be explained now: pid 1 (init) sleeps on "mount drain"
> because mp->mnt_lockref was 1. This setting was done by pid 18 (gjournal
> switcher) by calling vfs_busy(). pid 18 now sleeps on "suspwt" because
> mp->mnt_writeopcount was 1. This setting was done by pid 1 before going
> to sleep by calling vn_start_write() in dounmount().
> 
> I think the reason for this deadlock is the commit r249055 which seems
> not to be compatible with gjournal.
Thank you for the analysis. I think 'not compatible' is something of an
understatement. The situation clearly causes a deadlock, you are right.

The vfs_busy(); vfs_write_suspend(); call sequence is somewhat dubious,
in fact, exactly because unmount could start in between. I think that
vfs_write_suspend() must avoid setting MNT_SUSPEND if unmount was
started. Patch below, for HEAD, should fix the problem, by marking the
callers of vfs_write_suspend(), which are not protected by the covered
vnode lock, with the VS_SKIP_UNMOUNT flag.

I believe that the conflicts on stable/8 should be trivial, if any.

diff --git a/sys/geom/journal/g_journal.c b/sys/geom/journal/g_journal.c
index a3c996c..3ce2785 100644
--- a/sys/geom/journal/g_journal.c
+++ b/sys/geom/journal/g_journal.c
@@ -2960,7 +2960,7 @@ g_journal_do_switch(struct g_class *classp)
GJ_TIMER_STOP(1, &bt, "BIO_FLUSH time of %s", sc->sc_name);
 
GJ_TIMER_START(1, &bt);
-   error = vfs_write_suspend(mp);
+   error = vfs_write_suspend(mp, VS_SKIP_UNMOUNT);
GJ_TIMER_STOP(1, &bt, "Suspend time of %s", mountpoint);
if (error != 0) {
GJ_DEBUG(0, "Cannot suspend file system %s (error=%d).",
diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
index 7eac0ef..06e59f9 100644
--- a/sys/kern/vfs_vnops.c
+++ b/sys/kern/vfs_vnops.c
@@ -1668,8 +1668,7 @@ vn_finished_secondary_write(mp)
  * Request a filesystem to suspend write operations.
  */
 int
-vfs_write_suspend(mp)
-   struct mount *mp;
+vfs_write_suspend(struct mount *mp, int flags)
 {
        int error;
 
@@ -1680,6 +1679,21 @@ vfs_write_suspend(mp)
        }
        while (mp->mnt_kern_flag & MNTK_SUSPEND)
                msleep(&mp->mnt_flag, MNT_MTX(mp), PUSER - 1, "wsuspfs", 0);
+
+       /*
+        * Unmount holds a write reference on the mount point.  If we
+        * own busy reference and drain for writers, we deadlock with
+        * the reference draining in the unmount path.  Callers of
+        * vfs_write_suspend() must specify VS_SKIP_UNMOUNT if
+        * vfs_busy() reference is owned and caller is not in the
+        * unmount context.
+        */
+       if ((flags & VS_SKIP_UNMOUNT) != 0 &&
+           (mp->mnt_kern_flag & MNTK_UNMOUNT) != 0) {
+               MNT_IUNLOCK(mp);
+               return (EBUSY);
+       }
+
        mp->mnt_kern_flag |= MNTK_SUSPEND;
        mp->mnt_susp_owner = curthread;
        if (mp->mnt_writeopcount > 0)
diff --git a/sys/sys/vnode.h b/sys/sys/vnode.h
index 42bfb65..b0cbcc0 100644
--- a/sys/sys/vnode.h
+++ b/sys/sys/vnode.h
@@ -398,6 +398,9 @@ extern int  vttoif_tab[];
 #define VR_START_WRITE  0x0001  /* vfs_write_resume: start write atomically */
 #define VR_NO_SUSPCLR   0x0002  /* vfs_write_resume: do not clear suspension */
 
+#define VS_SKIP_UNMOUNT 0x0001  /* vfs_write_suspend: fail if the
+                                   filesystem is being unmounted */
+
 #define VREF(vp)        vref(vp)
 
 #ifdef DIAGNOSTIC
@@ -711,7 +714,7 @@ int vn_io_fault_pgmove(vm_page_t ma[], vm_offset_t offset, int xfersize,
 int     vfs_cache_lookup(struct vop_lookup_args *ap);
 void    vfs_timestamp(struct timespec *);
 void    vfs_write_resume(struct mount *mp, int flags);
-int     vfs_write_suspend(struct mount *mp);
+int     vfs_write_suspend(struct mount *mp, int flags);
 int     vop_stdbmap(struct vop_bmap_args *);
 int     vop_stdfsync(struct vop_fsync_args *);
 int     vop_stdgetwritemount(struct vop_getwritemount_args *);
diff --git a/sys/ufs/ffs/ffs_snapshot.c b/sys/ufs/ffs/ffs_snapshot.c
index 9a9c88a..ad157aa 100644
--- a/sys/ufs/ffs/ffs_snapshot.c
+++ b/sys/ufs/ffs/ffs_snapshot.c
@@ -423,7 +423,7 @@ restart:
 */
for (;;) {
vn_finished_write(wrtmp);
-   if ((error = vfs_write_suspend(vp->v_mount)) != 0) {
+   if ((error = vfs_write_suspend(vp->v_mount, 0)) != 0) {
vn_start_write(NULL, &wrtmp, V_WAIT);
vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
goto out;
diff --git a/sys/ufs/ffs/ffs_suspend.c b/sys/ufs/ffs/ffs_suspend.c
index 3198c1a..a8c4578 100644
--- a/sys/ufs/ffs/ffs_suspend.c
+++ b/sys/ufs/ffs/ffs_suspend.c
@@ -206,7 +206,7 @@ ffs_susp_suspend(struct mount *mp)
return (EPERM);
 #endif
 
-   if ((error = vfs_write_suspend(mp)) != 0)
+   if ((error = vfs_write_

Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found

2013-07-07 Thread Andre Albsmeier
On Sun, 07-Jul-2013 at 14:32:17 +0200, Jeremy Chadwick wrote:
> On Sun, Jul 07, 2013 at 02:13:54PM +0200, Andre Albsmeier wrote:
> > On Sun, 07-Jul-2013 at 09:41:12 +0200, Konstantin Belousov wrote:
> > > On Sun, Jul 07, 2013 at 09:25:53AM +0200, Andre Albsmeier wrote:
> > > > OK, here we go (looks better now):
> > > > 
> > > > GNU gdb 6.1.1 [FreeBSD]
> > > > Copyright 2004 Free Software Foundation, Inc.
> > > > GDB is free software, covered by the GNU General Public License, and 
> > > > you are
> > > > welcome to change it and/or distribute copies of it under certain 
> > > > conditions.
> > > > Type "show copying" to see the conditions.
> > > > There is absolutely no warranty for GDB.  Type "show warranty" for 
> > > > details.
> > > > This GDB was configured as "i386-marcel-freebsd"...
> > > > 
> > > > Unread portion of the kernel message buffer:
> > > > dev = stripe/p, block = 592, fs = /palveli
> > > > panic: ffs_blkfree_cg: freeing free block
> > > > KDB: stack backtrace:
> > > > db_trace_self_wrapper(c08207eb,d70fc924,c05fdfc9,c081df13,c08a82e0,...) 
> > > > at db_trace_self_wrapper+0x26/frame 0xd70fc8f4
> > > > kdb_backtrace(c081df13,c08a82e0,c0833a0b,d70fc930,d70fc930,...) at 
> > > > kdb_backtrace+0x29/frame 0xd70fc900
> > > > panic(c0833a0b,c2aae178,250,0,c2af80d4,...) at panic+0xc9/frame 
> > > > 0xd70fc924
> > > > ffs_blkfree_cg(250,0,8000,49f,d70fcad0,...) at 
> > > > ffs_blkfree_cg+0x399/frame 0xd70fc9c8
> > > > ffs_blkfree(c2b35100,c2af8000,c2b0d470,250,0,...) at 
> > > > ffs_blkfree+0xad/frame 0xd70fca00
> > > > indir_trunc(fffa3ff4,,0,8000,0,...) at indir_trunc+0x658/frame 
> > > > 0xd70fcae0
> > > > indir_trunc(dff3,,c072df0a,c2d68d00,c087abd8,...) at 
> > > > indir_trunc+0x514/frame 0xd70fcbc0
> > > > handle_workitem_freeblocks(0,d70fcc4c,2,246,c2ab1000,...) at 
> > > > handle_workitem_freeblocks+0x2dc/frame 0xd70fcc24
> > > > process_worklist_item(0,0,0,c086ae78,0,...) at 
> > > > process_worklist_item+0x27a/frame 0xd70fcc6c
> > > > softdep_process_worklist(c2b36548,0,54,c0835825,64,...) at 
> > > > softdep_process_worklist+0x91/frame 0xd70fcc9c
> > > > softdep_flush(0,d70fcd08,0,c2aac2f0,0,...) at softdep_flush+0x3e4/frame 
> > > > 0xd70f
> > > > fork_exit(c0738bb0,0,d70fcd08) at fork_exit+0xa2/frame 0xd70fccf4
> > > > fork_trampoline() at fork_trampoline+0x8/frame 0xd70fccf4
> > > > --- trap 0, eip = 0, esp = 0xd70fcd40, ebp = 0 ---
> > > > Uptime: 2d16h29m37s
> > > > Physical memory: 503 MB
> > > > Dumping 95 MB: 80 64 48 32 16
> > > > 
> > > > No symbol "stopped_cpus" in current context.
> > > > No symbol "stoppcbs" in current context.
> > > > #0  doadump (textdump=1) at pcpu.h:249
> > > > 249 pcpu.h: No such file or directory.
> > > > in pcpu.h
> > > > (kgdb) where
> > > > #0  doadump (textdump=1) at pcpu.h:249
> > > > #1  0xc05f in kern_reboot (howto=260) at 
> > > > /src/src-9/sys/kern/kern_shutdown.c:449
> > > > #2  0xc05fe028 in panic (fmt=) at 
> > > > /src/src-9/sys/kern/kern_shutdown.c:637
> > > > #3  0xc0717899 in ffs_blkfree_cg (ump=0xc2b35100, fs=0xc2af8000, 
> > > > devvp=0xc2b0d470, bno=592, 
> > > > size=32768, inum=1183, dephd=0xd70fcad0) at 
> > > > /src/src-9/sys/ufs/ffs/ffs_alloc.c:2151
> > > > #4  0xc0717c8d in ffs_blkfree (ump=0xc2b35100, fs=0xc2af8000, 
> > > > devvp=0xc2b0d470, bno=592, 
> > > > size=32768, inum=1183, vtype=VREG, dephd=0xd70fcad0) at 
> > > > /src/src-9/sys/ufs/ffs/ffs_alloc.c:2280
> > > > #5  0xc0730348 in indir_trunc (freework=0xc2f99100, dbn=1642816, 
> > > > lbn=-376844)
> > > > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7965
> > > > #6  0xc0730204 in indir_trunc (freework=0xc2f99100, dbn=1639680, 
> > > > lbn=-8205)
> > > > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7946
> > > > #7  0xc07324bc in handle_workitem_freeblocks (freeblks=0xc2fc1e00, 
> > > > flags=512)
> > > > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7588
> > > > #8  0xc0730dfa in process_worklist_item (mp=0xc2b36548, target=10, 
> > > > flags=512)
> > > > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1774
> > > > #9  0xc07360c1 in softdep_process_worklist (mp=0xc2b36548, full=0)
> > > > at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1558
> > > > #10 0xc0738f94 in softdep_flush () at 
> > > > /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> > > > #11 0xc05d1b82 in fork_exit (callout=0xc0738bb0 , 
> > > > arg=0x0, frame=0xd70fcd08)
> > > > at /src/src-9/sys/kern/kern_fork.c:988
> > > > #12 0xc07ba904 in fork_trampoline () at 
> > > > /src/src-9/sys/i386/i386/exception.s:279
> > > > (kgdb) up 10
> > > > #10 0xc0738f94 in softdep_flush () at 
> > > > /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> > > > 1414progress += 
> > > > softdep_process_worklist(mp, 0);
> > > > 
> > > > -Andre
> > > 
> > > This looks unrelated, and exactly this panic usually has one of two
> > > causes:
> > > - corrupted filesystem, run fsck to recheck it;
> > 
> > root@palveli:~>fsck /dev/stripe/p 
> > ** /dev/stripe/p
> >