Re: freebsd 9-stable TOP problem from around Jan 10
On 2/14/12 4:20 PM, Jeremy Chadwick wrote:
> On Tue, Feb 14, 2012 at 03:35:01PM -0800, Julian Elischer wrote:
> > On 2/14/12 10:38 AM, Kevin Oberman wrote:
> > > On Tue, Feb 14, 2012 at 12:23 AM, Julian Elischer wrote:
> > > > Has anyone else seen a problem with top -H -S?
> > > >
> > > > after a short while the screen gets more and more corrupted..
> > > >
> > > > hitting ^L or turning off S & H modes helps .. for a while.
> > > >
> > > > If this is a known fixed problem, let me know but I need to co-ordinate with others to upgrade the machine in question.
> > > Not seeing it here on 9-stable. Could it be a display issue? I am using gnome-terminal with TERM defined as 'xterm'.
> > yeah I'm on a mac with iterm, but running through 'screen' .
> >
> > it's never been a problem before.. just since we upgraded to 9-stable.
> If you remove GNU screen from the picture does the problem go away? If so, I'm not surprised. :-)
>
> Make sure that when you're using GNU screen, that all shells launched "under/within" screen have TERM=screen. If they don't, then this is almost certainly the problem -- GNU screen "translates" between terminal types, meaning it translates its own terminal type ("screen") into whatever TERM is currently attached ("xterm", "iterm", whatever). See the last 4 paragraphs of my post here to understand what exactly GNU screen is doing:
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2011-June/063052.html
>
> So, in general, make sure your dotfiles and so on don't mess about with the $TERM environment variable and you should generally be okay.

it seems to have stopped doing it for no apparent reason will keep an eye on it. and save this email away for when it does it again.

> If within GNU screen TERM=screen and you see the problem, but outside of screen you use TERM=xterm (or something else) but don't see the problem, then I would almost certainly blame GNU screen.
>
> If you're looking for something that simply keeps a terminal running in the background, try nohup or tmux.
>
> Alternately, possibly someone added a "screen" entry to /etc/termcap on RELENG_9? I don't use 9 so I have no way to confirm this, but on 8 there is no such entry.

SC|screen|VT 100/ANSI X3.64 virtual terminal:\

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
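The TERM advice in this message can be checked mechanically. What follows is only a minimal sketch and not something posted in the thread: it assumes GNU screen exports the STY variable to the shells it starts, the helper name `term_looks_wrong` is hypothetical, and treating names such as `screen-256color` as acceptable is an assumption that goes beyond the advice quoted above.

```python
import os


def term_looks_wrong(env):
    """Return True when running under GNU screen with a non-screen TERM.

    `env` is a mapping such as os.environ. A missing or empty STY is
    taken to mean "not running under screen" (assumption, see lead-in).
    """
    if not env.get("STY"):
        return False  # not inside screen, nothing to check
    term = env.get("TERM", "")
    # Accept plain "screen" and variants like "screen-256color" (assumption).
    return not (term == "screen" or term.startswith("screen-"))


if __name__ == "__main__":
    if term_looks_wrong(os.environ):
        print("inside screen but TERM=%r; check your dotfiles" % os.environ.get("TERM"))
    else:
        print("TERM looks consistent")
```

Running it once inside screen and once outside shows quickly whether a dotfile is overriding TERM.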
Re: sysutils/pftop on 9.x+
On Tue, 14 Feb 2012, Florian Smeets wrote:
> On 14.02.12 17:14, Fabian Keil wrote:
> > Greg Rivers wrote:
> > > sysutils/pftop was marked broken on 9.x and above last March[1]. Are there any plans to fix it soon? It's a really handy utility.
> > >
> > > [1] http://www.freebsd.org/cgi/cvsweb.cgi/ports/sysutils/pftop/Makefile?rev=1.17
> > Please have a look at:
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=155938
> >
> > Note that the currently working fix is in the audit trail, the original fix stopped working after the PF update.
> The PR was closed by mistake, I'll take care of it.

Thanks for committing the fix, Florian. pftop now builds and runs fine; tested on recent 9.0-STABLE amd64. Thanks also to Patrick for his input and especially to Fabian for creating the patches and filing the PR.

--
Greg Rivers
Re: disk devices speed is ugly
On Tue, Feb 14, 2012 at 10:50 PM, Scott Long wrote:
> Any filesystem that uses bread/bwrite/cluster_read are already using the
> "generic caching subsystem" that you propose. This includes UDF, CD9660,
> MSDOS, NTFS, XFS, ReiserFS, EXT2FS, and HPFS, i.e. every local storage
> filesystem in the tree except for ZFS. Not all of them implement
> VOP_GETPAGES/VOP_PUTPAGES, but those are just optimizations for the vnode
> pager, not requirements for using buffer-cache services on block devices.
> As Kostik pointed out in a parallel email, the only thing that was removed
> from FreeBSD was the userland interface to cached devices via /dev nodes.

Does this mean the Architecture Handbook page is wrong?

http://www.freebsd.org/doc/en/books/arch-handbook/driverbasics-block.html

--
Adam Vande More
[releng_8 tinderbox] failure on ia64/ia64
TB --- 2012-02-15 04:26:47 - tinderbox 2.9 running on freebsd-legacy2.sentex.ca TB --- 2012-02-15 04:26:47 - starting RELENG_8 tinderbox run for ia64/ia64 TB --- 2012-02-15 04:26:47 - cleaning the object tree TB --- 2012-02-15 04:27:08 - cvsupping the source tree TB --- 2012-02-15 04:27:08 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_8/ia64/ia64/supfile TB --- 2012-02-15 04:32:32 - building world TB --- 2012-02-15 04:32:32 - CROSS_BUILD_TESTING=YES TB --- 2012-02-15 04:32:32 - MAKEOBJDIRPREFIX=/obj TB --- 2012-02-15 04:32:32 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-02-15 04:32:32 - SRCCONF=/dev/null TB --- 2012-02-15 04:32:32 - TARGET=ia64 TB --- 2012-02-15 04:32:32 - TARGET_ARCH=ia64 TB --- 2012-02-15 04:32:32 - TZ=UTC TB --- 2012-02-15 04:32:32 - __MAKE_CONF=/dev/null TB --- 2012-02-15 04:32:32 - cd /src TB --- 2012-02-15 04:32:32 - /usr/bin/make -B buildworld >>> World build started on Wed Feb 15 04:32:33 UTC 2012 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Wed Feb 15 05:32:53 UTC 2012 TB --- 2012-02-15 05:32:53 - generating LINT kernel config TB --- 2012-02-15 05:32:53 - cd /src/sys/ia64/conf TB --- 2012-02-15 05:32:53 - /usr/bin/make -B LINT TB --- 2012-02-15 05:32:53 - cd /src/sys/ia64/conf TB --- 2012-02-15 05:32:53 - /usr/sbin/config -m LINT TB --- 2012-02-15 05:32:53 - building LINT kernel TB --- 2012-02-15 05:32:53 - CROSS_BUILD_TESTING=YES TB --- 2012-02-15 05:32:53 - MAKEOBJDIRPREFIX=/obj TB --- 2012-02-15 05:32:53 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-02-15 05:32:53 - SRCCONF=/dev/null TB --- 2012-02-15 05:32:53 - TARGET=ia64 TB --- 
2012-02-15 05:32:53 - TARGET_ARCH=ia64 TB --- 2012-02-15 05:32:53 - TZ=UTC TB --- 2012-02-15 05:32:53 - __MAKE_CONF=/dev/null TB --- 2012-02-15 05:32:53 - cd /src TB --- 2012-02-15 05:32:53 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Wed Feb 15 05:32:53 UTC 2012 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/dev/mxge/mxge_ethp_z8e.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/dev/mxge/mxge_rss_eth_z8e.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/dev/mxge/mxge_rss_ethp_z8e.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/dev/my/if_my.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototype
Re: 6.2-Release ..ish.. CF + ata == freeze?
2 of the 3 CF cards are very new, like less than 6 months old. I think around 65-70 percent is in use. This number doesn't change unless the user dumps data in a home dir, which isn't the case so far.

You are correct that only writes are failing. Msgbuf has more than what I pasted but I'm pretty sure it's just more of the same errors. I'll redouble my check.

The other slices are very small. One is 35 meg, the other is 100-some-odd meg. H is 1.2 gig.

I don't know if I'll be able to try the dd test for a few reasons but I'll check it out. Let me ask you this. Say zeroing out the drive works without error. Does that tell me anything?

I also don't have access to SMART tools as this is basically a closed system and the vendor would never give us access to a compiler. Granted I haven't tried just throwing on gcc from 6.2. I could play with that or maybe since said vendor's dev team is keeping track of this thread they could provide said binary :).

I really don't like the idea of replacing hardware as I'm looking at around 200 boxes. I really hope it doesn't come to that.

Thanks for the reply!

Sent via BlackBerry from T-Mobile

-Original Message-
From: Jeremy Chadwick
Date: Mon, 13 Feb 2012 21:18:28
To: john fleming
Cc: freebsd-stable@freebsd.org
Subject: Re: 6.2-Release ..ish.. CF + ata == freeze?

On Mon, Feb 13, 2012 at 08:43:08PM -0800, john fleming wrote:
> Just thought i would post over here as i'm not getting a warm fuzzy from
> checkpoint about being able to find the root cause of an issue. I have a
> large install base of IPSO checkpoint firewalls, which are based on FreeBSD
> 6.2. I've had 3 firewalls hang basically the same way, with something that
> looks like a filesystem issue or an issue with a CF card.

FreeBSD 6.2 was EOL'd in early-to-mid-2008. The ATA driver has changed significantly since then (present-day uses CAM).

> Does anyone happen to know of any bugs (i've been looking around) that could
> cause something like that? Granted, it could be a batch of bad CF cards, but
> its odd that i'm seeing the same thing on 3 different boxes and once rebooted
> they seem ok.
>
> Also is it possible to get useful info form the atacontroller when things go
> south like this from the ddb prompt?

Not particularly. What's shown below indicates that the driver had issued some form of ATA write command (there are multiple kinds per ATA specification), and either the underlying media (CF/disk) or controller stalled/locked up/took too long. I forget what the timeout value is in 6.2; I can't be bothered to remember such from 6 years ago. :-)

> This is what shows in show msgbuf
> ad0: timeout waiting to issue command
> ad0: error issuing WRITE command
> ad0: timeout waiting to issue command
> ad0: error issuing WRITE command
> ad0: timeout waiting to issue command
> ad0: error issuing WRITE command
> ad0: timeout waiting to issue command
> ad0: error issuing WRITE command
> g_vfs_done():ad0s4h[WRITE(offset=33849344, length=131072)]error = 5
> g_vfs_done():ad0s4h[WRITE(offset=33980416, length=131072)]error = 5
> g_vfs_done():ad0s4h[WRITE(offset=34111488, length=131072)]error = 5
> g_vfs_done():ad0s4h[WRITE(offset=34242560, length=131072)]error = 5
> g_vfs_done():ad0s4h[WRITE(offset=34373632, length=131072)]error = 5

error 5 = EIO = Input/output error. But this isn't too big of a surprise given the timeouts you see prior.

Are these CF cards brand new -- meaning, are they completely unused (having never had any writes done to them), or have they been in use a while? I'm betting they've been in use a while, and have probably been doing many writes over the years.

Two things to note here:

1) The errors you've shown are only happening on writes, not reads. Of course if you omitted information then this isn't an accurate statement.

2) Timeouts are seen when issuing writes to some LBA regions.

How full is the CF card, disk-space-wise? Not just ad0s4h, I'm talking about the entire card. How much space is roughly available? They're very small CF cards (1.8GByte roughly), and the less space available, the less effective wear levelling is (and in some cases the slower the writes are).

Reason I ask: given that these are CF cards, this smells of cards which are simply "worn down". CF cards have limited numbers of writes, and the card may be "freaking out" internally when attempting to write to some LBAs which map to CF sectors that are, in effect, "bad". The CF cards' ECC implementation may be buggy, or may simply be "spinning hard" for too long. You can read about this sort of behaviour on Wikipedia's CompactFlash article.

You wouldn't be able to verify this with dd if=/dev/ad0, because those are read operations. You could zero the media (dd if=/dev/zero of=/dev/ad0) as a form of verification if you wanted.

Do you happen to know if these CF cards support SMART? If so, installing smartmontools (version 5.42 or newer please) and providing output from
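As a side note on the g_vfs_done() lines quoted in this message, the arithmetic behind "error 5" and the write offsets can be worked through in a short script. This is only an illustrative sketch under stated assumptions: the offsets are taken to be byte offsets within ad0s4h, and the names `MSGBUF`, `parse` and `contiguous` are hypothetical helpers, not anything used in the thread.

```python
import errno
import re

# The five g_vfs_done() lines copied from the msgbuf output quoted above.
MSGBUF = """\
g_vfs_done():ad0s4h[WRITE(offset=33849344, length=131072)]error = 5
g_vfs_done():ad0s4h[WRITE(offset=33980416, length=131072)]error = 5
g_vfs_done():ad0s4h[WRITE(offset=34111488, length=131072)]error = 5
g_vfs_done():ad0s4h[WRITE(offset=34242560, length=131072)]error = 5
g_vfs_done():ad0s4h[WRITE(offset=34373632, length=131072)]error = 5
"""

PATTERN = re.compile(
    r"g_vfs_done\(\):(\w+)\[(\w+)\(offset=(\d+), length=(\d+)\)\]error = (\d+)"
)


def parse(text):
    """Return a list of (device, op, offset, length, error) tuples."""
    return [
        (dev, op, int(off), int(length), int(err))
        for dev, op, off, length, err in PATTERN.findall(text)
    ]


def contiguous(records):
    """True if every request starts exactly where the previous one ended."""
    return all(a[2] + a[3] == b[2] for a, b in zip(records, records[1:]))


records = parse(MSGBUF)
first, last = records[0], records[-1]
span = last[2] + last[3] - first[2]  # bytes covered from first to last request

print("failed requests:", len(records))
print("all writes:", all(r[1] == "WRITE" for r in records))
print("errno name:", errno.errorcode[first[4]])
print("contiguous:", contiguous(records))
print("span in KiB:", span // 1024)
```

Under those assumptions the five failures are back-to-back 128 KiB writes covering one region of the slice, which fits the observation that only writes to some regions time out.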
Re: disk devices speed is ugly
On Feb 14, 2012, at 1:02 PM, Peter Jeremy wrote: > On 2012-Feb-13 08:28:21 -0500, Gary Palmer wrote: >> The filesystem is the *BEST* place to do caching. It knows what metadata >> is most effective to cache and what other data (e.g. file contents) doesn't >> need to be cached. > > Agreed. > >> Any attempt to do this in layers between the FS and >> the disk won't achieve the same gains as a properly written filesystem. > > Agreed - but traditionally, Unix uses this approach via block devices. > For various reasons, FreeBSD moved caching into UFS and removed block > devices. Unfortunately, this means that any FS that wants caching has > to implement its own - and currently only UFS & ZFS do. > > What would be nice is a generic caching subsystem that any FS can use > - similar to the old block devices but with hooks to allow the FS to > request read-ahead, advise of unwanted blocks and ability to flush > dirty blocks in a requested order with the equivalent of barriers > (request Y will not occur until preceeding request X has been > committed to stable media). This would allow filesystems to regain > the benefits of block devices with minimal effort and then improve > performance & cache efficiency with additional work. > Any filesystem that uses bread/bwrite/cluster_read are already using the "generic caching subsystem" that you propose. This includes UDF, CD9660, MSDOS, NTFS, XFS, ReiserFS, EXT2FS, and HPFS, i.e. every local storage filesystem in the tree except for ZFS. Not all of them implement VOP_GETPAGES/VOP_PUTPAGES, but those are just optimizations for the vnode pager, not requirements for using buffer-cache services on block devices. As Kostik pointed out in a parallel email, the only thing that was removed from FreeBSD was the userland interface to cached devices via /dev nodes. This has nothing to do with filesystems, though I suppose that could maybe sorta kinda be an issue for FUSE?. 
ZFS isn't in this list because it implements its own private buffer/cache (the ARC) that understands the special requirements of ZFS. There are good and bad aspects to this, noted below. > One downside of the "each FS does its own caching" in that the caches > are all separate and need careful integration into the VM subsystem to > prevent starvation (eg past problems with UFS starving ZFS L2ARC). > I'm not sure what you mean here. The ARC is limited by available wired memory; attempts to allocate such memory will evict pages from the buffer cache as necessary, until all available RAM is consumed. If anything, ZFS starves the rest of the system, not the other way around, and that's simply because the ARC isn't integrated with the normal VM. Such integration is extremely hard and has nothing to do with having a generic caching subsystem. Scott
Re: 9-stable - ifmedia_set: no match for 0x0/0xfffffff
On Sun, Jan 29, 2012 at 01:19:40PM +0900, Randy Bush wrote:
> > What happens if you set hw.bge.allow_asf to 0 and use auto-negotiation
> > on both sides?
>
> it works! the switch was already auto-neg, and i forced auto-neg on the
> server side.

Apart from the suspend/resume issue, bge(4) still needs more code to handle controllers with ASF/IPMI firmware. This part is mostly undocumented and hard to experiment with due to lack of hardware access. The current IPMI/ASF handling code shows mixed results, and setting hw.bge.allow_asf to 0 will break IPMI support.

> thanks. this was not pleasant. did i remember to whine that i am in
> tokyo and the server is on the beast coast of the states? :)
>
> i think a bit of a warning about hw.bge.allow_asf in UPDATING might help
> folk.
>
> thank you *very* much for your help.
>
> randy
Re: CARP carpdev
On 14. Feb 2012, at 22:04 , Hugo Silva wrote:
> On 02/14/12 17:33, Freddie Cash wrote:
>> On Tue, Feb 14, 2012 at 8:56 AM, Hugo Silva wrote:
>>> Looks like there's been conversations about porting this to FreeBSD since at
>>> least 2007.
>>>
>>> Are there any plans to have ifconfig carpdev available in 9.0-STABLE?
>>
>> CARP support has been redone in 10-CURRENT, removing the whole "carp0"
>> pseudo-interface support, and just enabling the CARP protocol on the
>> existing network interfaces. This includes the equivalent of "carpdev"
>> support.
>>
>> Search the -current archives for more information, CFT, and so on.
>>
>> I don't recall seeing anything about specific plans to MFC to
>> stable/9, but could be mis-remembering things.
>>
>
> http://svnweb.freebsd.org/base?view=revision&revision=228571
>
> The single IP limitation may be a problem in some locations..
>
> Did not find anything about a possible MFC either. glebius@ is cc'd, perhaps
> he can add something, but based on
> http://svn.freebsd.org/base/stable/9/UPDATING, I don't think it's been MFCd
> (there's a primer for the new carp in current's UPDATING)

There are no plans to MFC, given that it changes things significantly.

I do wonder, however, whether someone wants to provide a user branch in SVN with regular patchsets for stable/9 and maybe even stable/8 (8.3R), to help people who are not going to HEAD?

--
Bjoern A. Zeeb
You have to have visions!
It does not matter how good you are. It matters what good you do!
[releng_8 tinderbox] failure on ia64/ia64
TB --- 2012-02-15 00:31:23 - tinderbox 2.9 running on freebsd-legacy2.sentex.ca TB --- 2012-02-15 00:31:23 - starting RELENG_8 tinderbox run for ia64/ia64 TB --- 2012-02-15 00:31:23 - cleaning the object tree TB --- 2012-02-15 00:31:23 - cvsupping the source tree TB --- 2012-02-15 00:31:23 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_8/ia64/ia64/supfile TB --- 2012-02-15 00:36:50 - building world TB --- 2012-02-15 00:36:50 - CROSS_BUILD_TESTING=YES TB --- 2012-02-15 00:36:50 - MAKEOBJDIRPREFIX=/obj TB --- 2012-02-15 00:36:50 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-02-15 00:36:50 - SRCCONF=/dev/null TB --- 2012-02-15 00:36:50 - TARGET=ia64 TB --- 2012-02-15 00:36:50 - TARGET_ARCH=ia64 TB --- 2012-02-15 00:36:50 - TZ=UTC TB --- 2012-02-15 00:36:50 - __MAKE_CONF=/dev/null TB --- 2012-02-15 00:36:50 - cd /src TB --- 2012-02-15 00:36:50 - /usr/bin/make -B buildworld >>> World build started on Wed Feb 15 00:36:50 UTC 2012 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Wed Feb 15 01:37:12 UTC 2012 TB --- 2012-02-15 01:37:12 - generating LINT kernel config TB --- 2012-02-15 01:37:12 - cd /src/sys/ia64/conf TB --- 2012-02-15 01:37:12 - /usr/bin/make -B LINT TB --- 2012-02-15 01:37:12 - cd /src/sys/ia64/conf TB --- 2012-02-15 01:37:12 - /usr/sbin/config -m LINT TB --- 2012-02-15 01:37:12 - building LINT kernel TB --- 2012-02-15 01:37:12 - CROSS_BUILD_TESTING=YES TB --- 2012-02-15 01:37:12 - MAKEOBJDIRPREFIX=/obj TB --- 2012-02-15 01:37:12 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-02-15 01:37:12 - SRCCONF=/dev/null TB --- 2012-02-15 01:37:12 - TARGET=ia64 TB --- 
2012-02-15 01:37:12 - TARGET_ARCH=ia64 TB --- 2012-02-15 01:37:12 - TZ=UTC TB --- 2012-02-15 01:37:12 - __MAKE_CONF=/dev/null TB --- 2012-02-15 01:37:12 - cd /src TB --- 2012-02-15 01:37:12 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Wed Feb 15 01:37:12 UTC 2012 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/dev/mxge/mxge_ethp_z8e.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/dev/mxge/mxge_rss_eth_z8e.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/dev/mxge/mxge_rss_ethp_z8e.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding -Werror /src/sys/dev/my/if_my.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototype
Re: disk devices speed is ugly
On Wed, Feb 15, 2012 at 07:02:58AM +1100, Peter Jeremy wrote:
> On 2012-Feb-13 08:28:21 -0500, Gary Palmer wrote:
> >The filesystem is the *BEST* place to do caching. It knows what metadata
> >is most effective to cache and what other data (e.g. file contents) doesn't
> >need to be cached.
>
> Agreed.
>
> >Any attempt to do this in layers between the FS and
> >the disk won't achieve the same gains as a properly written filesystem.
>
> Agreed - but traditionally, Unix uses this approach via block devices.
> For various reasons, FreeBSD moved caching into UFS and removed block
> devices. Unfortunately, this means that any FS that wants caching has
> to implement its own - and currently only UFS & ZFS do.

Block caching is still there; only the user-accessible interface was removed. UFS utilizes the buffer cache for the device which carries the volume, for metadata caching. There are some memory areas in UFS which can be classified as caches in their own right, but they exist mostly to support operation, not caching (e.g. the inodeblock copy accompanying each inode).

> What would be nice is a generic caching subsystem that any FS can use
> - similar to the old block devices but with hooks to allow the FS to
> request read-ahead, advise of unwanted blocks and ability to flush
> dirty blocks in a requested order with the equivalent of barriers
> (request Y will not occur until preceeding request X has been
> committed to stable media). This would allow filesystems to regain
> the benefits of block devices with minimal effort and then improve
> performance & cache efficiency with additional work.
>
> One downside of the "each FS does its own caching" in that the caches
> are all separate and need careful integration into the VM subsystem to
> prevent starvation (eg past problems with UFS starving ZFS L2ARC).

Other filesystems which use vfs_bio, like cd9660 or ufs, use the same disk cache layer as UFS.
Re: ZFS + nullfs + Linuxulator = panic?
On Tue, Feb 14, 2012 at 09:38:18AM -0500, Paul Mather wrote:
> I have a problem with RELENG_8 (FreeBSD/amd64 running a GENERIC kernel, last
> built 2012-02-08). It will panic during the daily periodic scripts that run
> at 3am. Here is the most recent panic message:
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer = 0x20:0x8069d266
> stack pointer = 0x28:0xff8094b90390
> frame pointer = 0x28:0xff8094b903a0
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= resume, IOPL = 0
> current process = 72566 (ps)
> trap number = 9
> panic: general protection fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0x8062cf8e at kdb_backtrace+0x5e
> #1 0x805facd3 at panic+0x183
> #2 0x808e6c20 at trap_fatal+0x290
> #3 0x808e715a at trap+0x10a
> #4 0x808cec64 at calltrap+0x8
> #5 0x805ee034 at fill_kinfo_thread+0x54
> #6 0x805eee76 at fill_kinfo_proc+0x586
> #7 0x805f22b8 at sysctl_out_proc+0x48
> #8 0x805f26c8 at sysctl_kern_proc+0x278
> #9 0x8060473f at sysctl_root+0x14f
> #10 0x80604a2a at userland_sysctl+0x14a
> #11 0x80604f1a at __sysctl+0xaa
> #12 0x808e62d4 at amd64_syscall+0x1f4
> #13 0x808cef5c at Xfast_syscall+0xfc

Please look up the source line number for fill_kinfo_thread+0x54.
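One way to act on that request is to turn the backtrace frames into debugger lookups. The sketch below is not from the thread: it only splits each frame into symbol and offset and prints a `list *(symbol+offset)` command, which is ordinary gdb syntax that kgdb also accepts, assuming a kernel built with debug symbols is available. The names `FRAMES` and `parse_frames` are hypothetical, and the addresses are used exactly as they appear in this archive, where they look shortened.

```python
import re

# Two frames copied from the backtrace quoted above.
FRAMES = """\
#5 0x805ee034 at fill_kinfo_thread+0x54
#6 0x805eee76 at fill_kinfo_proc+0x586
"""

FRAME_RE = re.compile(r"#(\d+) (0x[0-9a-f]+) at (\w+)\+(0x[0-9a-f]+)")


def parse_frames(text):
    """Return (frame number, address, symbol, offset) for each frame line."""
    return [
        (int(num), int(addr, 16), sym, int(off, 16))
        for num, addr, sym, off in FRAME_RE.findall(text)
    ]


for num, addr, sym, off in parse_frames(FRAMES):
    # addr - off is where the function starts, as far as the printed
    # (possibly shortened) address allows us to tell.
    print("frame %d: %s starts at 0x%x" % (num, sym, addr - off))
    print("  in kgdb, try: list *(%s+0x%x)" % (sym, off))
```

Because the lookup uses symbol plus offset rather than the raw address, it does not depend on the shortened addresses being complete.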
RE: Reducing the need to compile a custom kernel
>> - CPU_SOEKRIS, CPU_GEODE, CPU_ELAN, NO_SWAPPING for embedded devices
>
>Embedded devices are out of the scope of this, normally you do a lot of other modifications to such systems anyway, so a custom kernel should be not a big problem.

Just as a quick data point here, I have just installed FreeBSD onto an ALIX system and was hoping to keep everything very standard. It turns out that I needed to rebuild the kernel to add CPU_GEODE to get a few simple features added. Everything else is standard GENERIC because I'm too lazy to fine-tune.

The geode code is very small and I would expect it to be completely harmless if left enabled in GENERIC. The overhead of including it for other systems would be a few extra compares during startup and a k or so extra size in the kernel.

I would suggest that avoiding custom kernels to make trivial changes is exactly what you should be looking at. Make features like this removable for the people who want to fine-tune their kernels, but include them for people who are happy to have a little overhead as a trade-off for ease of management.

The only other thing that regularly has me running custom kernels is IPFIREWALL_FORWARD. As others have said, I'd be very happy if that was the default but removable.

Brian Scott
Re: ZFS + nullfs + Linuxulator = panic?
On Tue, Feb 14, 2012 at 09:38:18AM -0500, Paul Mather wrote: > I have a problem with RELENG_8 (FreeBSD/amd64 running a GENERIC kernel, last > built 2012-02-08). It will panic during the daily periodic scripts that run > at 3am. Here is the most recent panic message: > > Fatal trap 9: general protection fault while in kernel mode > cpuid = 0; apic id = 00 > instruction pointer = 0x20:0x8069d266 > stack pointer = 0x28:0xff8094b90390 > frame pointer = 0x28:0xff8094b903a0 > code segment= base 0x0, limit 0xf, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags= resume, IOPL = 0 > current process = 72566 (ps) > trap number = 9 > panic: general protection fault > cpuid = 0 > KDB: stack backtrace: > #0 0x8062cf8e at kdb_backtrace+0x5e > #1 0x805facd3 at panic+0x183 > #2 0x808e6c20 at trap_fatal+0x290 > #3 0x808e715a at trap+0x10a > #4 0x808cec64 at calltrap+0x8 > #5 0x805ee034 at fill_kinfo_thread+0x54 > #6 0x805eee76 at fill_kinfo_proc+0x586 > #7 0x805f22b8 at sysctl_out_proc+0x48 > #8 0x805f26c8 at sysctl_kern_proc+0x278 > #9 0x8060473f at sysctl_root+0x14f > #10 0x80604a2a at userland_sysctl+0x14a > #11 0x80604f1a at __sysctl+0xaa > #12 0x808e62d4 at amd64_syscall+0x1f4 > #13 0x808cef5c at Xfast_syscall+0xfc > Uptime: 3d19h6m0s > Dumping 1308 out of 2028 MB:..2%..12%..21%..31%..41%..51%..62%..71%..81%..91% > Dump complete > Automatic reboot in 15 seconds - press a key on the console to abort > Rebooting... > > > The reason for the subject line is that I have another RELENG_8 system that > uses ZFS + nullfs but doesn't panic, leading me to believe that ZFS + nullfs > is not the problem. I am wondering if it is the combination of the three > that is deadly, here. > > Both RELENG_8 systems are root-on-ZFS installs. Each night there is a > separate backup script that runs and completes before the regular "periodic > daily" run. 
This script takes a recursive snapshot of the ZFS pool and then > mounts these snapshots via mount_nullfs to provide a coherent view of the > filesystem under /backup. The only difference between the two RELENG_8 > systems is that one uses rsync to back up /backup to another machine and the > other uses the Linux Tivoli TSM client to back up /backup to a TSM server. > After the backup is completed, a script runs that unmounts the nullfs file > systems and then destroys the ZFS snapshot. > > The first (rsync backup) RELENG_8 system does not panic. It has been running > the ZFS + nullfs rsync backup job without incident for weeks now. The second > (Tivoli TSM) RELENG_8 will reliably panic when the subsequent "periodic > daily" job runs. (It is using the 32-bit TSM 6.2.4 Linux client running > "dsmc schedule" via the linux_base-f10-10_4 package.) The actual ZFS + > nullfs Tivoli TSM backup job appears to run successfully, making me wonder if > perhaps it has some memory leak or other subtle corruption that sets up the > ensuing panic when the "periodic daily" job later gives the system a workout. > > If I can provide more information about the panic, please let me know. > Despite the message about dumping in the panic output above, when the system > reboots I get a "No core dumps found" message during boot. (I have > dumpdev="AUTO" set in /etc/rc.conf.) My swap device is on separate > partitions but is mirrored using geom_mirror as /dev/mirror/swap. Do crash > dumps to gmirror devices work on RELENG_8? See gmirror(8) man page, section NOTES. Read the full thing. > Does anyone have any idea what is to blame for the panic, or how I can fix or > work around it? Does the panic always happen when "ps" is run? That's what's shown in the above panic message. 
Quoting:

> current process = 72566 (ps)

And I'm inclined to think it does, based on the backtrace:

> #5 0x805ee034 at fill_kinfo_thread+0x54
> #6 0x805eee76 at fill_kinfo_proc+0x586
> #7 0x805f22b8 at sysctl_out_proc+0x48
> #8 0x805f26c8 at sysctl_kern_proc+0x278

But if you can go through the previous panics and confirm that, it would be helpful to developers in tracking down the problem.

Sorry I can't be of any more assistance than this.

-- 
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |
Re: freebsd 9-stable TOP problem from around Jan 10
On Tue, Feb 14, 2012 at 03:35:01PM -0800, Julian Elischer wrote: > On 2/14/12 10:38 AM, Kevin Oberman wrote: > >On Tue, Feb 14, 2012 at 12:23 AM, Julian Elischer wrote: > >>Has anyone else seen a problem with top -H -S? > >> > >>after a short while the screen gets more and more corrupted.. > >> > >>hitting ^L or turning off S& H modes helps .. for a while. > >> > >>If this is a known fixed problem, let me know but I need to co-ordinate with > >>others > >>to upgrade the machine in question. > >Not seeing it here on 9-stable. Could it be a display issue? I am > >using gnome-terminal with TERM defined as 'xterm'. > > yeah I'm on a mac with iterm, but running through 'screen' . > > it's never been a problem before.. just since we upgraded to 9-stable. If you remove GNU screen from the picture does the problem go away? If so, I'm not surprised. :-) Make sure that when you're using GNU screen, that all shells launched "under/within" screen have TERM=screen. If they don't, then this is almost certainly the problem -- GNU screen "translates" between terminal types, meaning it translates its own terminal type ("screen") into whatever TERM is currently attached ("xterm", "iterm", whatever). See the last 4 paragraphs of my post here to understand what exactly GNU screen is doing: http://lists.freebsd.org/pipermail/freebsd-stable/2011-June/063052.html So, in general, make sure your dotfiles and so on don't mess about with the $TERM environment variable and you should generally be okay. If within GNU screen TERM=screen and you see the problem, but outside of screen you use TERM=xterm (or something else) but don't see the problem, then I would almost certainly blame GNU screen. If you're looking for something that simply keeps a terminal running in the background, try nohup or tmux. Alternately, possibly someone added a "screen" entry to /etc/termcap on RELENG_9? I don't use 9 so I have no way to confirm this, but on 8 there is no such entry. 
-- 
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |
HEADS UP: Xen merge coming to stable/8
Hi folks, I'm planning to merge almost all of the Xen changes from FreeBSD/head into stable/8 soon. This should bring more features, stability, etc. I've attached what will be the commit message. If there are any objections, speak now. Ken -- Kenneth Merry k...@freebsd.org MFC r215818, r216405, r216437, r216448, r216956, r221827, r222975, r223059, r225343, r225704, r225705, r225706, r225707, r225709, r226029, r220647, r230183, r230587, r230916, r228526, r230879: Bring Xen support in stable/8 up to parity with head. r215818 | cperciva | 2010-11-25 08:05:21 -0700 (Thu, 25 Nov 2010) | 5 lines Rename HYPERVISOR_multicall (which performs the multicall hypercall) to _HYPERVISOR_multicall, and create a new HYPERVISOR_multicall function which invokes _HYPERVISOR_multicall and checks that the individual hypercalls all succeeded. r216405 | rwatson | 2010-12-13 05:15:46 -0700 (Mon, 13 Dec 2010) | 7 lines Add options NO_ADAPTIVE_SX to the XENHVM kernel configuration, matching its similar disabling of adaptive mutexes and rwlocks. The existing comment on why this is the case also applies to sx locks. MFC after:3 days Discussed with: attilio r216437 | gibbs | 2010-12-14 10:23:49 -0700 (Tue, 14 Dec 2010) | 2 lines Remove spurious printf left over from debugging our XenStore support. r216448 | gibbs | 2010-12-14 13:57:40 -0700 (Tue, 14 Dec 2010) | 4 lines Fix a typo in a comment. Noticed by: Attila Nagy r216956 | rwatson | 2011-01-04 07:49:54 -0700 (Tue, 04 Jan 2011) | 8 lines Make "options XENHVM" compile for i386, not just amd64 -- a largely mechanical change. This opens the door for using PV device drivers under Xen HVM on i386, as well as more general harmonisation of i386 and amd64 Xen support in FreeBSD. Reviewed by: cperciva MFC after:3 weeks r221827 | mav | 2011-05-12 21:40:16 -0600 (Thu, 12 May 2011) | 2 lines Fix msleep() usage in Xen balloon driver to not wake up on every HZ tick. 
r222975 | gibbs | 2011-06-10 22:59:01 -0600 (Fri, 10 Jun 2011) | 63 lines Monitor and emit events for XenStore changes to XenBus trees of the devices we manage. These changes can be due to writes we make ourselves or due to changes made by the control domain. The goal of these changes is to insure that all state transitions can be detected regardless of their source and to allow common device policies (e.g. "onlined" backend devices) to be centralized in the XenBus bus code. sys/xen/xenbus/xenbusvar.h: sys/xen/xenbus/xenbus.c: sys/xen/xenbus/xenbus_if.m: Add a new method for XenBus drivers "localend_changed". This method is invoked whenever a write is detected to a device's XenBus tree. The default implementation of this method is a no-op. sys/xen/xenbus/xenbus_if.m: sys/dev/xen/netfront/netfront.c: sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/blkback/blkback.c: Change the signature of the "otherend_changed" method. This notification cannot fail, so it should return void. sys/xen/xenbus/xenbusb_back.c: Add "online" device handling to the XenBus Back Bus support code. An online backend device remains active after a front-end detaches as a reconnect is expected to occur in the near future. sys/xen/interface/io/xenbus.h: Add comment block further explaining the meaning and driver responsibilities associated with the XenBus Closed state. sys/xen/xenbus/xenbusb.c: sys/xen/xenbus/xenbusb.h: sys/xen/xenbus/xenbusb_back.c: sys/xen/xenbus/xenbusb_front.c: sys/xen/xenbus/xenbusb_if.m: o Register a XenStore watch against the local XenBus tree for all devices. o Cache the string length of the path to our local tree. o Allow the xenbus front and back drivers to hook/filter both local and otherend watch processing. o Update the device ivar version of "state" when we detect a XenStore update of that node. 
sys/dev/xen/control/control.c: sys/xen/xenbus/xenbus.c: sys/xen/xenbus/xenbusb.c: sys/xen/xenbus/xenbusb.h: sys/xen/xenbus/xenbusvar.h: sys/xen/xenstore/xenstorevar.h: Allow clients of the XenStore watch mechanism to attach a single uintptr_t worth of client data to the watch. This removes the need to carefully place client watch data within enclosing objects so that a cast or offsetof calculation can be used to convert from watch to enclosing object. Sponsored by: Spectra Logic Corporation MFC after:1 week r223059 | gibbs | 2011-06-13 14:36:29 -0600 (Mon, 13 Jun 2011) | 36 lines Several enhancements to the Xen block back driver. sys/dev/xen/blkback/blkback.c: o Implement front-end request coalescing. This greatly improves the performance of front-end clients that are unaware of the dynamic
Re: problems with AHCI on FreeBSD 8.2
On Feb 14, 2012, at 4:34 PM, Victor Balada Diaz wrote: > On Tue, Feb 14, 2012 at 03:09:58PM -0800, Jeremy Chadwick wrote: >> On Tue, Feb 14, 2012 at 11:15:27PM +0100, Victor Balada Diaz wrote: >>> On Tue, Feb 14, 2012 at 06:17:19PM +0100, Harald Schmalzbauer wrote: schrieb Jeremy Chadwick am 14.02.2012 17:50 (localtime): > On Tue, Feb 14, 2012 at 04:55:10PM +0100, Claudius Herder wrote: >> Hello, >> >> I have got a quite similar problem with AHCI on FreeBSD 8.2 and it still >> persists on FreeBSD 9.0 release. >> >> Switching from ahci to ataahci resolved the problem for me too. >> >> I'm using gmirror for swap, system is on a zpool and the problem first >> occurred during a zpool scrub, but it is easily reproducible with dd. >> >> The timeouts only occur when writing to disks, dd if=/dev/ada{0|1} >> of=/dev/null is not an issue. >> Sometimes I need to power off the server because after a reboot one disk >> is still missing. >> >> I really would like to help in this issue, so let me know if you need >> any more information. > I find it interesting that, at least so far, the only people reporting > problems of this type with the ahci.ko driver are people using Samsung > disks. The only difference is that your models are F1s while the OPs > are F2s. I saw such timeouts long ago and mav@ had a look at my postings and he mentioned it could be a NCQ problem. I suspected the disks firmware. I never tracked it down further, because after replacing the Samsung (F3 in that case) disks with hitachi ones solved all my problems and gave a big performance kick as well (with zfs). You can find the discussion here: http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html >>> >>> You gave me a good idea: try to disable NCQ and see if that's the fault. So >>> i went and applied the attached patch. After it, i can no longer reproduce >>> the issue with ahci driver. 
>>>
>>> I know this is not a solution because it disables NCQ at controller level
>>> instead of disk level, but at least we know for sure where the problem is.
>>>
>>> I think the solution would be to add a new quirk ADA_Q_NONCQ in
>>> sys/cam/ata/ata_da.c.
>>> Quirks infrastructure is already built, so adding a new quirk for this
>>> seems easy.
>>>
>>> Is someone interested? Do you think there is a better solution?
>>>
>>> If someone is interested i can build a patch to add ADA_Q_NONCQ quirk and
>>> add my drives to it.
>>
>> I took a stab at this, but I don't feel confident this is the proper
>> solution/method. I worry there's some sort of chicken-or-the-egg
>> condition here (quirk setup/matching comes *after* SATA capabilities
>> detection), or that it makes the code messier. Need mav@'s
>> recommendations on this.
>>
>> Below is for RELENG_8. I should note I haven't tested if this works, or
>> even compiles -- normally I don't provide such patches without testing
>> so I apologise in advance / user beware.
>
> You're amazingly fast. Thanks for all your help :)
>
> You start applying the quirks before
>
>	snprintf(announce_buf, sizeof(announce_buf),
>	    "kern.cam.ada.%d.quirks", periph->unit_number);
>	quirks = softc->quirks;
>	TUNABLE_INT_FETCH(announce_buf, &quirks);
>
> So you're breaking quirk setting at boot time.
>
> See my attached patch. I can confirm it works for me.
>
> Regards.
>

I don't think that disabling NCQ entirely is the right solution. It's a tag starvation issue in the firmware, not a complete failure, and it can be dealt with in the CAM XPT scheduler fairly efficiently. Alexander and I talked about this recently, and though we differ on the details, a tag hack is not in order, IMHO. In the short term, try just using "camcontrol tags ada0 -N 1" to limit the concurrent commands to 1.
Scott
Re: problems with AHCI on FreeBSD 8.2
On Wed, Feb 15, 2012 at 12:34:20AM +0100, Victor Balada Diaz wrote: > On Tue, Feb 14, 2012 at 03:09:58PM -0800, Jeremy Chadwick wrote: > > On Tue, Feb 14, 2012 at 11:15:27PM +0100, Victor Balada Diaz wrote: > > > On Tue, Feb 14, 2012 at 06:17:19PM +0100, Harald Schmalzbauer wrote: > > > > schrieb Jeremy Chadwick am 14.02.2012 17:50 (localtime): > > > > > On Tue, Feb 14, 2012 at 04:55:10PM +0100, Claudius Herder wrote: > > > > >> Hello, > > > > >> > > > > >> I have got a quite similar problem with AHCI on FreeBSD 8.2 and it > > > > >> still > > > > >> persists on FreeBSD 9.0 release. > > > > >> > > > > >> Switching from ahci to ataahci resolved the problem for me too. > > > > >> > > > > >> I'm using gmirror for swap, system is on a zpool and the problem > > > > >> first > > > > >> occurred during a zpool scrub, but it is easily reproducible with dd. > > > > >> > > > > >> The timeouts only occur when writing to disks, dd if=/dev/ada{0|1} > > > > >> of=/dev/null is not an issue. > > > > >> Sometimes I need to power off the server because after a reboot one > > > > >> disk > > > > >> is still missing. > > > > >> > > > > >> I really would like to help in this issue, so let me know if you need > > > > >> any more information. > > > > > I find it interesting that, at least so far, the only people reporting > > > > > problems of this type with the ahci.ko driver are people using Samsung > > > > > disks. The only difference is that your models are F1s while the OPs > > > > > are F2s. > > > > > > > > I saw such timeouts long ago and mav@ had a look at my postings and he > > > > mentioned it could be a NCQ problem. > > > > I suspected the disks firmware. > > > > I never tracked it down further, because after replacing the Samsung (F3 > > > > in that case) disks with hitachi ones solved all my problems and gave a > > > > big performance kick as well (with zfs). 
> > > > You can find the discussion here: > > > > http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html > > > > > > > > > > You gave me a good idea: try to disable NCQ and see if that's the fault. > > > So > > > i went and applied the attached patch. After it, i can no longer reproduce > > > the issue with ahci driver. > > > > > > I know this is not a solution because it disables NCQ at controller level > > > instead of disk level, but at least we know for sure where the problem is. > > > > > > I think the solution would be to add a new quirk ADA_Q_NONCQ in > > > sys/cam/ata/ata_da.c. > > > Quirks infraestructure is already built, so adding a new quirk for this > > > seems > > > easy. > > > > > > Is someone interested? Do you think there is a better solution? > > > > > > If someone is interested i can build a patch to add ADA_Q_NONCQ quirk and > > > add my drives > > > to it. > > > > I took a stab at this, but I don't feel confident this is the proper > > solution/method. I worry there's some sort of chicken-or-the-egg > > condition here (quirk setup/matching comes *after* SATA capabilities > > detection), or that it makes the code messier. Need mav@'s > > recommendations on this. > > > > Below is for RELENG_8. I should note I haven't tested if this works, or > > even compiles -- normally I don't provide such patches without testing > > so I apologise in advance / user beware. > > You're amazingly fast. Thanks for all your help :) > > You start applying the quirks before > > snprintf(announce_buf, sizeof(announce_buf), > "kern.cam.ada.%d.quirks", periph->unit_number); > quirks = softc->quirks; > TUNABLE_INT_FETCH(announce_buf, &quirks); > > So you're breaking quirk setting at boot time. I'm too tired to quite understand (in full) what's wrong with my patch, but I think you're referring to situations where someone would have kern.cam.ada.X.quirks set in loader.conf? 
If so, I believe that same situation would happen presently if someone set kern.cam.ada.X.quirks in their loader.conf to a value that did not contain bit #0 set to 1, and used one of the 4K sector disks listed in ada_quirk_table -- what's in loader.conf looks like it would overwrite whatever the kernel code bits chose automatically:

910         match = cam_quirkmatch((caddr_t)&cgd->ident_data,
911                                (caddr_t)ada_quirk_table,
912                                sizeof(ada_quirk_table)/sizeof(*ada_quirk_table),
913                                sizeof(*ada_quirk_table), ata_identify_match);
914         if (match != NULL)
915                 softc->quirks = ((struct ada_quirk_entry *)match)->quirks;
916         else
917                 softc->quirks = ADA_Q_NONE;
...
931         snprintf(announce_buf, sizeof(announce_buf),
932             "kern.cam.ada.%d.quirks", periph->unit_number);
933         quirks = softc->quirks;
934         TUNABLE_INT_FETCH(announce_buf, &quirks);
935         softc->quirks = quirks;

I read this to mean:

Lines 910-917 -- if there's a device ID st
ZFS + nullfs + Linuxulator = panic?
I have a problem with RELENG_8 (FreeBSD/amd64 running a GENERIC kernel, last built 2012-02-08). It will panic during the daily periodic scripts that run at 3am. Here is the most recent panic message:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0x8069d266
stack pointer       = 0x28:0xff8094b90390
frame pointer       = 0x28:0xff8094b903a0
code segment        = base 0x0, limit 0xf, type 0x1b
                    = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = resume, IOPL = 0
current process     = 72566 (ps)
trap number         = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
#0 0x8062cf8e at kdb_backtrace+0x5e
#1 0x805facd3 at panic+0x183
#2 0x808e6c20 at trap_fatal+0x290
#3 0x808e715a at trap+0x10a
#4 0x808cec64 at calltrap+0x8
#5 0x805ee034 at fill_kinfo_thread+0x54
#6 0x805eee76 at fill_kinfo_proc+0x586
#7 0x805f22b8 at sysctl_out_proc+0x48
#8 0x805f26c8 at sysctl_kern_proc+0x278
#9 0x8060473f at sysctl_root+0x14f
#10 0x80604a2a at userland_sysctl+0x14a
#11 0x80604f1a at __sysctl+0xaa
#12 0x808e62d4 at amd64_syscall+0x1f4
#13 0x808cef5c at Xfast_syscall+0xfc
Uptime: 3d19h6m0s
Dumping 1308 out of 2028 MB:..2%..12%..21%..31%..41%..51%..62%..71%..81%..91%
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...

The reason for the subject line is that I have another RELENG_8 system that uses ZFS + nullfs but doesn't panic, leading me to believe that ZFS + nullfs is not the problem. I am wondering if it is the combination of the three that is deadly, here.

Both RELENG_8 systems are root-on-ZFS installs. Each night there is a separate backup script that runs and completes before the regular "periodic daily" run. This script takes a recursive snapshot of the ZFS pool and then mounts these snapshots via mount_nullfs to provide a coherent view of the filesystem under /backup. The only difference between the two RELENG_8 systems is that one uses rsync to back up /backup to another machine and the other uses the Linux Tivoli TSM client to back up /backup to a TSM server. After the backup is completed, a script runs that unmounts the nullfs file systems and then destroys the ZFS snapshot.

The first (rsync backup) RELENG_8 system does not panic. It has been running the ZFS + nullfs rsync backup job without incident for weeks now. The second (Tivoli TSM) RELENG_8 will reliably panic when the subsequent "periodic daily" job runs. (It is using the 32-bit TSM 6.2.4 Linux client running "dsmc schedule" via the linux_base-f10-10_4 package.) The actual ZFS + nullfs Tivoli TSM backup job appears to run successfully, making me wonder if perhaps it has some memory leak or other subtle corruption that sets up the ensuing panic when the "periodic daily" job later gives the system a workout.

If I can provide more information about the panic, please let me know. Despite the message about dumping in the panic output above, when the system reboots I get a "No core dumps found" message during boot. (I have dumpdev="AUTO" set in /etc/rc.conf.) My swap device is on separate partitions but is mirrored using geom_mirror as /dev/mirror/swap. Do crash dumps to gmirror devices work on RELENG_8?

Does anyone have any idea what is to blame for the panic, or how I can fix or work around it?

Cheers,

Paul.

PS: The uptime of three days in the panic message is because I disabled the Tivoli TSM backup job on Friday so it would not run over the weekend.
RE: New BSD Installer
> -----Original Message-----
> From: owner-freebsd-sta...@freebsd.org [mailto:owner-freebsd-sta...@freebsd.org] On Behalf Of Mike Andrews
> Sent: Tuesday, February 14, 2012 1:11 PM
> To: freebsd-stable@freebsd.org
> Subject: Re: New BSD Installer
>
> On 2/14/2012 3:05 PM, Devin Teske wrote:
> > Please don't get rid of fdisk or bsdlabel as they are (and forever will be)
> > required to do things like:
> >
> > 1. scripted formatting of a thumb drive
> >
> > 2. automated probing of disk information (fdisk -p)
> >
> > 3. Other tasks that are not suitably handled by curses-based utilities
> >
> > For example, the following command will create a second Windows partition on a
> > thumb drive without user interaction:
> >
> > echo "p 2 0x0c * *" | fdisk -f - /dev/da0
> >
> > If you take away fdisk, how am I supposed to achieve the above?
>
> /sbin/gpart add -t 12 -i 2 da0
>

I stand corrected.

Ok, remove at-will but not before 10.0 please. Looking for 9.x to be the transitional phase.
-- 
Devin

The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you.
Re: problems with AHCI on FreeBSD 8.2
On Tue, Feb 14, 2012 at 03:09:58PM -0800, Jeremy Chadwick wrote: > On Tue, Feb 14, 2012 at 11:15:27PM +0100, Victor Balada Diaz wrote: > > On Tue, Feb 14, 2012 at 06:17:19PM +0100, Harald Schmalzbauer wrote: > > > schrieb Jeremy Chadwick am 14.02.2012 17:50 (localtime): > > > > On Tue, Feb 14, 2012 at 04:55:10PM +0100, Claudius Herder wrote: > > > >> Hello, > > > >> > > > >> I have got a quite similar problem with AHCI on FreeBSD 8.2 and it > > > >> still > > > >> persists on FreeBSD 9.0 release. > > > >> > > > >> Switching from ahci to ataahci resolved the problem for me too. > > > >> > > > >> I'm using gmirror for swap, system is on a zpool and the problem first > > > >> occurred during a zpool scrub, but it is easily reproducible with dd. > > > >> > > > >> The timeouts only occur when writing to disks, dd if=/dev/ada{0|1} > > > >> of=/dev/null is not an issue. > > > >> Sometimes I need to power off the server because after a reboot one > > > >> disk > > > >> is still missing. > > > >> > > > >> I really would like to help in this issue, so let me know if you need > > > >> any more information. > > > > I find it interesting that, at least so far, the only people reporting > > > > problems of this type with the ahci.ko driver are people using Samsung > > > > disks. The only difference is that your models are F1s while the OPs > > > > are F2s. > > > > > > I saw such timeouts long ago and mav@ had a look at my postings and he > > > mentioned it could be a NCQ problem. > > > I suspected the disks firmware. > > > I never tracked it down further, because after replacing the Samsung (F3 > > > in that case) disks with hitachi ones solved all my problems and gave a > > > big performance kick as well (with zfs). > > > You can find the discussion here: > > > http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html > > > > > > > You gave me a good idea: try to disable NCQ and see if that's the fault. So > > i went and applied the attached patch. 
After it, i can no longer reproduce > > the issue with ahci driver. > > > > I know this is not a solution because it disables NCQ at controller level > > instead of disk level, but at least we know for sure where the problem is. > > > > I think the solution would be to add a new quirk ADA_Q_NONCQ in > > sys/cam/ata/ata_da.c. > > Quirks infraestructure is already built, so adding a new quirk for this > > seems > > easy. > > > > Is someone interested? Do you think there is a better solution? > > > > If someone is interested i can build a patch to add ADA_Q_NONCQ quirk and > > add my drives > > to it. > > I took a stab at this, but I don't feel confident this is the proper > solution/method. I worry there's some sort of chicken-or-the-egg > condition here (quirk setup/matching comes *after* SATA capabilities > detection), or that it makes the code messier. Need mav@'s > recommendations on this. > > Below is for RELENG_8. I should note I haven't tested if this works, or > even compiles -- normally I don't provide such patches without testing > so I apologise in advance / user beware. You're amazingly fast. Thanks for all your help :) You start applying the quirks before snprintf(announce_buf, sizeof(announce_buf), "kern.cam.ada.%d.quirks", periph->unit_number); quirks = softc->quirks; TUNABLE_INT_FETCH(announce_buf, &quirks); So you're breaking quirk setting at boot time. See my attached patch. I can confirm it works for me. Regards. -- La prueba más fehaciente de que existe vida inteligente en otros planetas, es que no han intentado contactar con nosotros. 
--- ata_da.c	2012-02-14 22:17:54.0 +0100
+++ ata_da.c	2012-02-14 22:58:05.0 +0100
@@ -91,6 +91,7 @@
 typedef enum {
 	ADA_Q_NONE	= 0x00,
 	ADA_Q_4K	= 0x01,
+	ADA_Q_NONCQ	= 0x02,
 } ada_quirks;
 
 typedef enum {
@@ -162,6 +163,14 @@
 		/*quirks*/ADA_Q_4K
 	},
 	{
+		/*
+		 * Samsung have NCQ broken:
+		 * http://lists.freebsd.org/pipermail/freebsd-stable/2012-February/066168.html
+		 */
+		{ T_DIRECT, SIP_MEDIA_FIXED, "*", "SAMSUNG HD154UI*", "*" },
+		/*quirks*/ADA_Q_NONCQ
+	},
+	{
 		/* Samsung Advanced Format (4k) drives */
 		{ T_DIRECT, SIP_MEDIA_FIXED, "*", "SAMSUNG HD155UI*", "*" },
 		/*quirks*/ADA_Q_4K
@@ -967,6 +976,10 @@
 	softc->disk->d_maxsize = maxio;
 	softc->disk->d_unit = periph->unit_number;
 	softc->disk->d_flags = 0;
+	/* Disable NCQ if needed */
+	if (softc->flags & ADA_FLAG_CAN_NCQ &&
+	    softc->quirks & ADA_Q_NONCQ)
+		softc->flags ^= ADA_FLAG_CAN_NCQ;
 	if (softc->flags & ADA_FLAG_CAN_FLUSHCACHE)
 		softc->disk->d_flags |= DISKFLAG_CANFLUSHCACHE;
 	if ((softc->flags & ADA_FLAG_CAN_TRIM) ||
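For anyone following along, the ordering this patch relies on (quirk table match first, then the kern.cam.ada.N.quirks tunable override, and only then acting on the final quirk mask) can be sketched in plain userland C. This is only an illustration under stated assumptions, not the kernel code: struct fake_softc, resolve_quirks(), apply_ncq_quirk(), ncq_enabled_after_attach() and the FLAG_CAN_NCQ value are hypothetical stand-ins, and only the ADA_Q_* values are taken from the patch. It also assumes, as the snippet quoted in this thread suggests, that the tunable fetch leaves the variable alone when the tunable is not set. The sketch clears the bit with "&= ~" instead of the guarded "^=", which should give the same result here.

```c
#include <stddef.h>

/* Quirk bits as in the patch; the capability flag value is made up. */
enum { ADA_Q_NONE = 0x00, ADA_Q_4K = 0x01, ADA_Q_NONCQ = 0x02 };
enum { FLAG_CAN_NCQ = 0x10 };	/* hypothetical stand-in for ADA_FLAG_CAN_NCQ */

struct fake_softc {		/* hypothetical, not the kernel's softc */
	int flags;
	int quirks;
};

/*
 * Steps 1 and 2: start from the quirk-table result, then let a tunable
 * replace it if (and only if) one was supplied.
 */
int
resolve_quirks(int table_quirks, const int *tunable)
{
	int quirks = table_quirks;

	if (tunable != NULL)
		quirks = *tunable;
	return (quirks);
}

/* Step 3: act on the final quirk mask, after any override. */
void
apply_ncq_quirk(struct fake_softc *sc)
{
	if (sc->quirks & ADA_Q_NONCQ)
		sc->flags &= ~FLAG_CAN_NCQ;
}

/* Run the three steps in order; returns 1 if NCQ stays enabled. */
int
ncq_enabled_after_attach(int cap_flags, int table_quirks, const int *tunable)
{
	struct fake_softc sc;

	sc.flags = cap_flags;
	sc.quirks = resolve_quirks(table_quirks, tunable);
	apply_ncq_quirk(&sc);
	return ((sc.flags & FLAG_CAN_NCQ) != 0);
}
```

With this ordering, a tunable value of ADA_Q_NONCQ turns NCQ off even for a drive that is not in the table, and a tunable value of ADA_Q_NONE turns it back on for a drive that is, which is the boot-time behaviour the earlier patch would have lost.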
RE: Why won't 8.2 umount -f?
> -----Original Message-----
> From: owner-freebsd...@freebsd.org [mailto:owner-freebsd...@freebsd.org] On Behalf Of Doug Barton
> Sent: Tuesday, February 14, 2012 1:05 PM
> To: Rick Macklem
> Cc: freebsd...@freebsd.org; freebsd-stable@FreeBSD.org
> Subject: Re: Why won't 8.2 umount -f?
>
> On 02/14/2012 08:39, Rick Macklem wrote:
> > I took a look and they seem to have been MFC'd.
>
> That's awesome! Thanks for your time on this. I guess we've got some
> upgrading to do.
>

+1

Awaiting 8.3 with bated breath!
-- 
Devin
Re: freebsd 9-stable TOP problem from around Jan 10
On 2/14/12 10:38 AM, Kevin Oberman wrote:
> On Tue, Feb 14, 2012 at 12:23 AM, Julian Elischer wrote:
>> Has anyone else seen a problem with top -H -S?
>>
>> after a short while the screen gets more and more corrupted..
>>
>> hitting ^L or turning off S & H modes helps .. for a while.
>>
>> If this is a known fixed problem, let me know but I need to co-ordinate with
>> others to upgrade the machine in question.
> Not seeing it here on 9-stable. Could it be a display issue? I am
> using gnome-terminal with TERM defined as 'xterm'.

yeah I'm on a mac with iterm, but running through 'screen' .

it's never been a problem before.. just since we upgraded to 9-stable.
Re: problems with AHCI on FreeBSD 8.2
On Tue, Feb 14, 2012 at 11:15:27PM +0100, Victor Balada Diaz wrote: > On Tue, Feb 14, 2012 at 06:17:19PM +0100, Harald Schmalzbauer wrote: > > schrieb Jeremy Chadwick am 14.02.2012 17:50 (localtime): > > > On Tue, Feb 14, 2012 at 04:55:10PM +0100, Claudius Herder wrote: > > >> Hello, > > >> > > >> I have got a quite similar problem with AHCI on FreeBSD 8.2 and it still > > >> persists on FreeBSD 9.0 release. > > >> > > >> Switching from ahci to ataahci resolved the problem for me too. > > >> > > >> I'm using gmirror for swap, system is on a zpool and the problem first > > >> occurred during a zpool scrub, but it is easily reproducible with dd. > > >> > > >> The timeouts only occur when writing to disks, dd if=/dev/ada{0|1} > > >> of=/dev/null is not an issue. > > >> Sometimes I need to power off the server because after a reboot one disk > > >> is still missing. > > >> > > >> I really would like to help in this issue, so let me know if you need > > >> any more information. > > > I find it interesting that, at least so far, the only people reporting > > > problems of this type with the ahci.ko driver are people using Samsung > > > disks. The only difference is that your models are F1s while the OPs > > > are F2s. > > > > I saw such timeouts long ago and mav@ had a look at my postings and he > > mentioned it could be a NCQ problem. > > I suspected the disks firmware. > > I never tracked it down further, because after replacing the Samsung (F3 > > in that case) disks with hitachi ones solved all my problems and gave a > > big performance kick as well (with zfs). > > You can find the discussion here: > > http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html > > > > You gave me a good idea: try to disable NCQ and see if that's the fault. So > i went and applied the attached patch. After it, i can no longer reproduce > the issue with ahci driver. 
> > I know this is not a solution because it disables NCQ at controller level > instead of disk level, but at least we know for sure where the problem is. > > I think the solution would be to add a new quirk ADA_Q_NONCQ in > sys/cam/ata/ata_da.c. > Quirks infraestructure is already built, so adding a new quirk for this seems > easy. > > Is someone interested? Do you think there is a better solution? > > If someone is interested i can build a patch to add ADA_Q_NONCQ quirk and add > my drives > to it. I took a stab at this, but I don't feel confident this is the proper solution/method. I worry there's some sort of chicken-or-the-egg condition here (quirk setup/matching comes *after* SATA capabilities detection), or that it makes the code messier. Need mav@'s recommendations on this. Below is for RELENG_8. I should note I haven't tested if this works, or even compiles -- normally I don't provide such patches without testing so I apologise in advance / user beware. -- | Jeremy Chadwick j...@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB |

diff -ruN /usr/src/sys/cam/ata/ata_da.c src/sys/cam/ata/ata_da.c
--- /usr/src/sys/cam/ata/ata_da.c	2012-02-10 17:22:25.0 -0800
+++ src/sys/cam/ata/ata_da.c	2012-02-14 15:07:07.988814133 -0800
@@ -90,7 +90,8 @@
 
 typedef enum {
 	ADA_Q_NONE	= 0x00,
-	ADA_Q_4K	= 0x01,
+	ADA_Q_4K	= 0x01,	/* 4k sectors */
+	ADA_Q_NONCQ	= 0x02,	/* device has flaky NCQ support */
 } ada_quirks;
 
 typedef enum {
@@ -162,6 +163,11 @@
 		/*quirks*/ADA_Q_4K
 	},
 	{
+		/* Samsung Spinpoint F2 EG (EcoGreen) drives */
+		{ T_DIRECT, SIP_MEDIA_FIXED, "*", "SAMSUNG HD154UI*", "*" },
+		/*quirks*/ADA_Q_NONCQ,
+	},
+	{
 		/* Samsung Advanced Format (4k) drives */
 		{ T_DIRECT, SIP_MEDIA_FIXED, "*", "SAMSUNG HD155UI*", "*" },
 		/*quirks*/ADA_Q_4K
@@ -887,9 +893,6 @@
 		softc->flags |= ADA_FLAG_CAN_FLUSHCACHE;
 	if (cgd->ident_data.support.command1 & ATA_SUPPORT_POWERMGT)
 		softc->flags |= ADA_FLAG_CAN_POWERMGT;
-	if (cgd->ident_data.satacapabilities & ATA_SUPPORT_NCQ &&
-	    (cgd->inq_flags & SID_DMA) && (cgd->inq_flags & SID_CmdQue))
-		softc->flags |= ADA_FLAG_CAN_NCQ;
 	if (cgd->ident_data.support_dsm & ATA_SUPPORT_DSM_TRIM) {
 		softc->flags |= ADA_FLAG_CAN_TRIM;
 		softc->trim_max_ranges = TRIM_MAX_RANGES;
@@ -916,6 +919,15 @@
 	else
 		softc->quirks = ADA_Q_NONE;
 
+	/*
+	 * Do not enable NCQ for devices which have the ADA_Q_NONCQ quirk.
+	 */
+	if (!(softc->quirks & ADA_Q_NONCQ)) {
+		if (cgd->ident_data.satacapabilities & ATA_SUPPORT_NCQ &&
+		    (cgd->i
Re: problems with AHCI on FreeBSD 8.2
Thank you again Jeremy, sure it helps! On Tue, Feb 14, 2012 at 9:31 PM, Jeremy Chadwick wrote: > On Tue, Feb 14, 2012 at 09:19:02PM +0100, Oscar Prieto wrote: >> Thank you Jeremy, i'm already checking your links. >> >> When i installed smartd i configured a daily short test and a weekly >> long one for all the drives while the machine remains mostly unused, >> never thought it could be a problem reading the documentation and info >> around. >> >> # /usr/local/etc/smartd.conf >> /dev/ada0 -a -o on -S on -s (S/../.././03|L/../../2/07) >> /dev/ada1 -a -o on -S on -s (S/../.././04|L/../../3/07) >> /dev/ada2 -a -o on -S on -s (S/../.././05|L/../../4/07) >> /dev/ada3 -a -o on -S on -s (S/../.././06|L/../../5/07) > > The problem is that, quite honestly, these do you zero good. All it does > is make a mess (per se) of the SMART self-test log. > > Take for example your situation with ada3: smartd(8) told you that the > number of pending sectors increased to 5, and uncorrected increased to > 1. That's really all you need to know at that point. If you want to > know the LBA numbers which are problematic, you can manually intervene. > > The point is: the drive itself is going to notice problematic or bad > sectors quicker than periodic short or long or surface scan tests will. > Let the drive do its thing normally and only use SMART tests when > there's indication something is wrong. > >> I'll remove the checks, do you advice for removing the daemon altogether? > > smartd(8) is useful because it keeps track of attributes which change in > value and logs data to syslog (if I remember right), thus you have an > exact time/date when an attribute changed. This is especially useful > for things pertaining to sector/physical media problems. > > As such, I tend to recommend folks using smartd(8) properly tune their > smartd.conf to only monitor specific attributes. 
This varies from drive > to drive, but the key ones are things like attributes 5, 10, 11, 192, > 193, 194 (if you want temperature logging), 196, 197, 198, 199, and 200. > I'm speaking strictly for Western Digital disks here. > > The stock defaults, if I remember right, are to "monitor everything", > which really doesn't work well given that so many vendors encode their > RAW_VALUE fields in proprietary/vendor-specific formats. People will > often monitor things like the Hardware_ECC_Recovered attribute and start > "freaking out" once day when the value goes from 0 to 838938239 or > something larger. Attribute data formats are not part of the ATA > standard, so vendors choose to encode them. Plus, not many admins that > I've run into (honest) know what that attribute actually means > disk-wise (hint: it's 100% normal for sector ECC to happen at all times; > magnetic media is not perfect, that's what the per-sector ECC section is > for!) > > However: people don't understand what SMART attribute acquisition > actually does behind the scenes -- it results in the disk having to read > from the HPA area (not user accessible or within LBA regions), which > means seeking + moving the arms to an area, reading, then reporting all > of this back. Thus, it impacts I/O performance. This is why I don't > use smartd(8) on any of our systems. But if I was to use it? I would > have it poll maybe every 120 minutes, rather than every 30. It all > depends on the system/load/etc.. I've seen people poll every 5 minutes > (I think they're absolutely crazy/paranoid). Their systems, their > problem. :-) > > Hope this helps. > > -- > | Jeremy Chadwick j...@parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, US | > | Making life hard for others since 1977. 
PGP 4BD6C0CB |
Re: problems with AHCI on FreeBSD 8.2
On Tue, Feb 14, 2012 at 06:17:19PM +0100, Harald Schmalzbauer wrote: > schrieb Jeremy Chadwick am 14.02.2012 17:50 (localtime): > > On Tue, Feb 14, 2012 at 04:55:10PM +0100, Claudius Herder wrote: > >> Hello, > >> > >> I have got a quite similar problem with AHCI on FreeBSD 8.2 and it still > >> persists on FreeBSD 9.0 release. > >> > >> Switching from ahci to ataahci resolved the problem for me too. > >> > >> I'm using gmirror for swap, system is on a zpool and the problem first > >> occurred during a zpool scrub, but it is easily reproducible with dd. > >> > >> The timeouts only occur when writing to disks, dd if=/dev/ada{0|1} > >> of=/dev/null is not an issue. > >> Sometimes I need to power off the server because after a reboot one disk > >> is still missing. > >> > >> I really would like to help in this issue, so let me know if you need > >> any more information. > > I find it interesting that, at least so far, the only people reporting > > problems of this type with the ahci.ko driver are people using Samsung > > disks. The only difference is that your models are F1s while the OPs > > are F2s. > > I saw such timeouts long ago and mav@ had a look at my postings and he > mentioned it could be a NCQ problem. > I suspected the disks firmware. > I never tracked it down further, because after replacing the Samsung (F3 > in that case) disks with hitachi ones solved all my problems and gave a > big performance kick as well (with zfs). > You can find the discussion here: > http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html > You gave me a good idea: try to disable NCQ and see if that's the fault. So i went and applied the attached patch. After it, i can no longer reproduce the issue with ahci driver. I know this is not a solution because it disables NCQ at controller level instead of disk level, but at least we know for sure where the problem is. I think the solution would be to add a new quirk ADA_Q_NONCQ in sys/cam/ata/ata_da.c. 
The quirks infrastructure is already built, so adding a new quirk for this seems easy.

Is anyone interested? Do you think there is a better solution?

If someone is interested I can build a patch to add the ADA_Q_NONCQ quirk and add my drives to it.

Regards.

--
The most convincing proof that intelligent life exists on other planets is that they have not tried to contact us.
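As a rough illustration of the per-device quirk idea proposed above, here is a minimal, self-contained C sketch. It is not the ata_da.c code: the `struct fake_disk` type, the helper `ada_can_ncq()` and the `CAP_*` constants are hypothetical stand-ins invented for this example. Only the `ADA_Q_*` names and values follow the patch posted in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Quirk bits; names and values as in the patch posted in this thread. */
enum ada_quirks {
	ADA_Q_NONE  = 0x00,
	ADA_Q_4K    = 0x01,	/* 4k sectors */
	ADA_Q_NONCQ = 0x02	/* device has flaky NCQ support */
};

/* Hypothetical stand-ins for what the drive and controller report. */
#define CAP_NCQ		0x01u	/* drive advertises NCQ */
#define CAP_DMA		0x02u	/* DMA is usable */
#define CAP_CMDQUE	0x04u	/* command queueing is usable */

struct fake_disk {
	uint32_t caps;		/* CAP_* bits */
	uint32_t quirks;	/* ADA_Q_* bits, filled in by quirk matching */
};

/*
 * Decide whether NCQ should be used for this disk. The quirk is
 * checked first, so a blacklisted model never gets NCQ even if it
 * advertises the capability.
 */
bool
ada_can_ncq(const struct fake_disk *d)
{
	const uint32_t need = CAP_NCQ | CAP_DMA | CAP_CMDQUE;

	if (d->quirks & ADA_Q_NONCQ)
		return (false);
	return ((d->caps & need) == need);
}
```

The sketch only works if the quirk table has been matched before the NCQ decision is made, which is the ordering concern raised elsewhere in this thread about the untested patch.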
Re: CARP carpdev
On 02/14/12 17:33, Freddie Cash wrote:
> On Tue, Feb 14, 2012 at 8:56 AM, Hugo Silva wrote:
>> Looks like there's been conversations about porting this to FreeBSD
>> since at least 2007. Are there any plans to have ifconfig carpdev
>> available in 9.0-STABLE?
>
> CARP support has been redone in 10-CURRENT, removing the whole "carp0"
> pseudo-interface support, and just enabling the CARP protocol on the
> existing network interfaces. This includes the equivalent of "carpdev"
> support.
>
> Search the -current archives for more information, CFT, and so on.
>
> I don't recall seeing anything about specific plans to MFC to stable/9,
> but could be mis-remembering things.

http://svnweb.freebsd.org/base?view=revision&revision=228571

The single IP limitation may be a problem in some locations.

Did not find anything about a possible MFC either. glebius@ is cc'd, perhaps he can add something, but based on http://svn.freebsd.org/base/stable/9/UPDATING, I don't think it's been MFC'd (there's a primer for the new carp in current's UPDATING).
Re: LSI supported mps(4) driver in stable/9 and stable/8
According to Kenneth D. Merry: > So it is perfectly fine to run the driver in stable/9 or stable/8 without > the CAM changes. Excellent, thank you Ken. -- Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- robe...@keltia.freenix.fr In memoriam to Ondine : http://ondine.keltia.net/
Re: New BSD Installer
On 2/14/2012 3:05 PM, Devin Teske wrote:
> Please don't get rid of fdisk or bsdlabel as they are (and forever will be)
> required to do things like:
>
> 1. scripted formatting of a thumb drive
> 2. automated probing of disk information (fdisk -p)
> 3. Other tasks that are not suitably handled by curses-based utilities
>
> For example, the following command will create a second Windows partition on a
> thumb drive without user interaction:
>
> echo "p 2 0x0c * *" | fdisk -f - /dev/da0
>
> If you take away fdisk, how am I supposed to achieve the above?

/sbin/gpart add -t 12 -i 2 da0

(Untested, but that should work...)

gpart is very scriptable, and still handles MBR and bsdlabel partitions if you need to work with removable media or volumes that will never be larger than 2 TB. "gpart list" and "gpart show" would get you all the machine-parsable stuff you'd ever need.

The 2 TB limit is *the* reason to move from MBR+bsdlabel to GPT though. Even without RAID, 3 TB disks exist already. :) With FreeBSD's boot code, you don't even need an EFI-capable machine to boot from a GPT-partitioned device.

For non-removable media, it's time to move on. Really. :) Even on smaller 250 GB disks, I'm using GPT just because there's no reason not to... it's just cleaner, and it was easier to write gpart scripts than it was to write fdisk/bsdlabel scripts anyway.
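For reference, the 2 TB figure falls out of simple arithmetic: an MBR partition entry stores its start and its length as 32-bit sector counts. A small C sketch of that calculation follows; the function name `mbr_max_bytes()` is made up for illustration.

```c
#include <stdint.h>

/*
 * Upper bound, in bytes, on what a 32-bit sector count can describe
 * for a given logical sector size (2^32 sectors times the sector size).
 */
uint64_t
mbr_max_bytes(uint32_t sector_size)
{
	return ((UINT64_C(1) << 32) * sector_size);
}
```

With 512-byte sectors this gives 2^41 bytes, i.e. 2 TiB. With 4096-byte logical sectors the same field would reach 16 TiB, but that only helps on disks that actually present 4k logical sectors.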
Re: Why won't 8.2 umount -f?
On 02/14/2012 08:39, Rick Macklem wrote: > I took a look and they seem to have been MFC'd. That's awesome! Thanks for your time on this. I guess we've got some upgrading to do. Doug -- It's always a long day; 86400 doesn't fit into a short. Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/
Re: problems with AHCI on FreeBSD 8.2
On Tue, Feb 14, 2012 at 09:19:02PM +0100, Oscar Prieto wrote: > Thank you Jeremy, i'm already checking your links. > > When i installed smartd i configured a daily short test and a weekly > long one for all the drives while the machine remains mostly unused, > never thought it could be a problem reading the documentation and info > around. > > # /usr/local/etc/smartd.conf > /dev/ada0 -a -o on -S on -s (S/../.././03|L/../../2/07) > /dev/ada1 -a -o on -S on -s (S/../.././04|L/../../3/07) > /dev/ada2 -a -o on -S on -s (S/../.././05|L/../../4/07) > /dev/ada3 -a -o on -S on -s (S/../.././06|L/../../5/07) The problem is that, quite honestly, these do you zero good. All it does is make a mess (per se) of the SMART self-test log. Take for example your situation with ada3: smartd(8) told you that the number of pending sectors increased to 5, and uncorrected increased to 1. That's really all you need to know at that point. If you want to know the LBA numbers which are problematic, you can manually intervene. The point is: the drive itself is going to notice problematic or bad sectors quicker than periodic short or long or surface scan tests will. Let the drive do its thing normally and only use SMART tests when there's indication something is wrong. > I'll remove the checks, do you advice for removing the daemon altogether? smartd(8) is useful because it keeps track of attributes which change in value and logs data to syslog (if I remember right), thus you have an exact time/date when an attribute changed. This is especially useful for things pertaining to sector/physical media problems. As such, I tend to recommend folks using smartd(8) properly tune their smartd.conf to only monitor specific attributes. This varies from drive to drive, but the key ones are things like attributes 5, 10, 11, 192, 193, 194 (if you want temperature logging), 196, 197, 198, 199, and 200. I'm speaking strictly for Western Digital disks here. 
The stock defaults, if I remember right, are to "monitor everything", which really doesn't work well given that so many vendors encode their RAW_VALUE fields in proprietary/vendor-specific formats. People will often monitor things like the Hardware_ECC_Recovered attribute and start "freaking out" once day when the value goes from 0 to 838938239 or something larger. Attribute data formats are not part of the ATA standard, so vendors choose to encode them. Plus, not many admins that I've run into (honest) know what that attribute actually means disk-wise (hint: it's 100% normal for sector ECC to happen at all times; magnetic media is not perfect, that's what the per-sector ECC section is for!) However: people don't understand what SMART attribute acquisition actually does behind the scenes -- it results in the disk having to read from the HPA area (not user accessible or within LBA regions), which means seeking + moving the arms to an area, reading, then reporting all of this back. Thus, it impacts I/O performance. This is why I don't use smartd(8) on any of our systems. But if I was to use it? I would have it poll maybe every 120 minutes, rather than every 30. It all depends on the system/load/etc.. I've seen people poll every 5 minutes (I think they're absolutely crazy/paranoid). Their systems, their problem. :-) Hope this helps. -- | Jeremy Chadwick j...@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: problems with AHCI on FreeBSD 8.2
Thank you Jeremy, I'm already checking your links.

When I installed smartd I configured a daily short test and a weekly long one for all the drives, at times when the machine is mostly unused. From reading the documentation and info around, I never thought it could be a problem.

# /usr/local/etc/smartd.conf
/dev/ada0 -a -o on -S on -s (S/../.././03|L/../../2/07)
/dev/ada1 -a -o on -S on -s (S/../.././04|L/../../3/07)
/dev/ada2 -a -o on -S on -s (S/../.././05|L/../../4/07)
/dev/ada3 -a -o on -S on -s (S/../.././06|L/../../5/07)

I'll remove the checks. Do you advise removing the daemon altogether?

On Tue, Feb 14, 2012 at 8:51 PM, Martin Sugioarto wrote:
> On Tue, 14 Feb 2012 20:24:32 +0100
> Harald Schmalzbauer wrote:
>
>> I guess it's always the firmware of the EcoGreen models which cause
>> these problems. Your drive isn't EG...
>> I don't remember exactly the different model numbers, but I'm sure
>> they were all EcoGreen. The lower power consumption was the reason to
>> choose these specific drives (different capacities and F2/F3 series
>> tried), with acceptable performance loss - I thought. But it turned
>> out that EcoGreen and NCQ as well as RAIDZ demands don't fit
>> together...
>
> Hi,
>
> I intentionally did not buy any Eco or Green model because I don't like
> them (Load_Cycle_Count bugs and so on). I realized, I like to use 1 Watt
> more power but have the performance doubled.
>
> --
> Martin
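For anyone unsure what those -s patterns mean: as far as I understand smartd.conf(5), the regular expression is matched against a string of the form T/MM/DD/d/HH (test type, month, day of month, day of week with 1 = Monday, hour). Below is a small C sketch using POSIX regex(3) to try a pattern by hand. `schedule_matches()` is a hypothetical helper written for this example, and smartd's own matching may differ in details.

```c
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Return true if the schedule pattern covers the whole time string,
 * e.g. "L/02/14/2/07" for a long test on Tuesday Feb 14 at 07:00.
 */
bool
schedule_matches(const char *pattern, const char *when)
{
	char anchored[256];
	regex_t re;
	bool hit;
	int len;

	/* Anchor the pattern so it has to cover the whole string. */
	len = snprintf(anchored, sizeof(anchored), "^(%s)$", pattern);
	if (len < 0 || (size_t)len >= sizeof(anchored))
		return (false);
	if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
		return (false);
	hit = (regexec(&re, when, 0, NULL, 0) == 0);
	regfree(&re);
	return (hit);
}
```

Read this way, the ada0 line asks for a short test every day at 03:00 and a long test every Tuesday at 07:00, with the other drives staggered by an hour and a day.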
Re: New BSD Installer
On Tue, Feb 14, 2012 at 12:05:31PM -0800, Devin Teske wrote: > Please don't get rid of fdisk or bsdlabel as they are (and forever will be) > required to do things like: > > 1. scripted formatting of a thumb drive Can't this be done with gpart(8)? There are scripts all over the web and on the lists here showing people using it for that purpose. It doesn't require use of GPT either. > 2. automated probing of disk information (fdisk -p) Can't this be accomplished with "gpart list"? Yes I know the man page doesn't appear to have it documented, but it's there. Furthermore, fdisk -p shows silly things like C/H/S nomenclature; do you really use this? Do you have boards which don't support even the most basic 28-bit LBA addressing? > 3. Other tasks that are not suitably handled by curses-based utilities > > ... > > For example, the following command will create a second Windows partition on a > thumb drive without user interaction: > > echo "p 2 0x0c * *" | fdisk -f - /dev/da0 > > If you take away fdisk, how am I supposed to achieve the above? Again: gpart(8). And before you complain: yes, I am in full agreement that introduction of gpart into the fray should have probably been "more public". The syntax of the gpart commands takes some getting used to as well (some things are hardly intuitive, but eventually make sense once you see them in use). I'm happy to use gpart for scripting, while fdisk/bsdlabel are like pulling teeth. That said, like others, I would be thrilled to see fdisk and bsdlabel/disklabel disappear. However, for that to happen, I really expect gpart to be better documented. Hell, all of the GEOM-based g* utilities should be implemented slightly... differently. It's hard to explain what I mean by this. Play with the geom(8) command sometime to see what I mean. "geom list" says to use "geom list list", etc.. 
Once you delve into the code to see how it all works it then starts making more sense why the utilities behave this way, but it's completely and entirely non-intuitive to anyone not already familiar with it. -- | Jeremy Chadwick j...@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB |
RE: New BSD Installer
> -Original Message- > From: Kevin Oberman [mailto:kob6...@gmail.com] > Sent: Tuesday, February 14, 2012 11:51 AM > To: Devin Teske > Cc: Ian Smith; Bruce Cran; Alex Samorukov; Joe Holden; FreeBSD Stable Mailing > List > Subject: Re: New BSD Installer > > On Tue, Feb 14, 2012 at 9:43 AM, Devin Teske > wrote: > > > > > >> -Original Message- > >> From: owner-freebsd-sta...@freebsd.org [mailto:owner-freebsd- > >> sta...@freebsd.org] On Behalf Of Ian Smith > >> Sent: Tuesday, February 14, 2012 9:15 AM > >> To: Bruce Cran > >> Cc: FreeBSD Stable Mailing List; Joe Holden; Alex Samorukov > >> Subject: Re: New BSD Installer > >> > >> Strangely, the big push to GPT partitions was oft said to be because MBR > >> slices provided too few partitions. > > > > That's part of it (no pun intended). > > > > The other big deal is that you can't exceed 2TB on a single primary partition. > > > > > >> I never found 4 * 6 much of a limit > >> myself, and now the default install makes a Linux-like single partition, > >> rendering dump & restore more or less unusable or at least impractical, > > > > I'm with you on this one. I really don't like the single-"/" setup. > > > > > >> while booting multiple systems on GPT also seems to require Linux tools. > >> > >> I don't know whether this move away from BSD traditional filesystem > >> partitioning (/, /var, /usr etc) to Linux-style came down from Core On > >> High or is just the prerogative of installer-writers? Jordan was both > >> the latter and a big part of the former for many years, but I guess > >> that's something that can be reverted if people feel to do so. > >> > > > > Maybe a vote should be taken. There's about 12 votes in this office here alone > > for putting the partition scheme back the way it was (Colin Percival had a great > > formula for determining partition sizes). 
> > I suggest that both be implemented, which looks to the untrained eye > as a straight-forward thing to implement, and then the install ask if > a single partition or a traditional multi-partition system should be > installed. I prefer multi and use that on all of my systems. > > I also really prefer GPT for a variety of reasons, but we need better > tools to support things. I miss booteasy. Yes, you can get it to boot > from a different partition, but it is a pain. I deal with it by > putting FreeBSD on one disk and Windows on another when I want a > dual-boot system. I put the MBR formatted (Windows) is first in the > boot order, so I can just hit F5 to boot the FreeBSD disk. > > This works for me, but I suspect that lots of people would prefer > having multiple OSes on a single disk...especially when it's a single > spindle laptop. (I suspect laptops are more commonly dual-boot than > most any other platform.) > > As for fdisk and bsdlabel, I'm happy to see both go. They have a > horrid user interface and require a calculator to get right. Yes, I > use them, but only because there is no other way to do some things. > (sade(8) comes closer all of the time, though.) Please don't get rid of fdisk or bsdlabel as they are (and forever will be) required to do things like: 1. scripted formatting of a thumb drive 2. automated probing of disk information (fdisk -p) 3. Other tasks that are not suitably handled by curses-based utilities For example, the following command will create a second Windows partition on a thumb drive without user interaction: echo "p 2 0x0c * *" | fdisk -f - /dev/da0 If you take away fdisk, how am I supposed to achieve the above? -- Devin _ The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. 
Re: disk devices speed is ugly
On 2012-Feb-13 08:28:21 -0500, Gary Palmer wrote:
>The filesystem is the *BEST* place to do caching. It knows what metadata
>is most effective to cache and what other data (e.g. file contents) doesn't
>need to be cached.

Agreed.

> Any attempt to do this in layers between the FS and
>the disk won't achieve the same gains as a properly written filesystem.

Agreed - but traditionally, Unix uses this approach via block devices. For various reasons, FreeBSD moved caching into UFS and removed block devices. Unfortunately, this means that any FS that wants caching has to implement its own - and currently only UFS & ZFS do.

What would be nice is a generic caching subsystem that any FS can use - similar to the old block devices but with hooks to allow the FS to request read-ahead, advise of unwanted blocks, and flush dirty blocks in a requested order with the equivalent of barriers (request Y will not occur until the preceding request X has been committed to stable media). This would allow filesystems to regain the benefits of block devices with minimal effort and then improve performance & cache efficiency with additional work.

One downside of the "each FS does its own caching" approach is that the caches are all separate and need careful integration into the VM subsystem to prevent starvation (e.g. past problems with UFS starving ZFS L2ARC).

--
Peter Jeremy
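To make the barrier idea concrete, here is a tiny C sketch of the ordering rule only. Everything in it (the `struct wreq` layout and the `can_issue()` helper) is hypothetical and not an existing FreeBSD interface: a write may be issued only once the request it is ordered behind has been committed.

```c
#include <stdbool.h>
#include <stddef.h>

struct wreq {
	int  id;		/* request identifier */
	int  after;		/* id that must be committed first, -1 if none */
	bool committed;		/* true once the data is on stable media */
};

/* May request q[i] be sent to the device yet? */
bool
can_issue(const struct wreq *q, size_t n, size_t i)
{
	size_t j;

	if (q[i].after < 0)
		return (true);
	for (j = 0; j < n; j++) {
		if (q[j].id == q[i].after)
			return (q[j].committed);
	}
	/* Dependency is no longer queued: treat it as already committed. */
	return (true);
}
```

A real subsystem would also need the read-ahead and "unwanted block" hints, plus integration with the VM system; this only shows the "Y waits for X" rule.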
Re: problems with AHCI on FreeBSD 8.2
On Tue, Feb 14, 2012 at 08:31:23PM +0100, Oscar Prieto wrote: > I used to had tons of ahci errors in my 4 disk raidz1 worth of > HD154UIs when the rig was built a year ago or so (with 8.0 Release), > but they dissapeared after tuning ZFS. > > Sadly i also got a new timeout days ago followed with smartcl erros i > still keep unchecked but i guess they cold be legit, i still have to > test/swap cables and give it a try. About your ada3 disk: The below SMART errors indicate your disk does in fact have physical media problems -- 1 confirmed bad sector, and 5 which are "suspect". "Suspect" LBAs are unreadable until writes are issued to them. A write will induce the drive to re-analyse the sector at that LBA and determine if it's truly bad or not. A single LBA can actually take quite a long time to analyse (it depends on what the problem is), and may result in 30+ seconds of delay. You can either let the drive figure it out over normal usage patterns, or you can do it manually yourself time permitting. Your drive that shows read failures in the SMART self-test log gives you the LBA numbers; try reading from those LBAs first. I can explain this procedure in another thread/offline/whatever. (Does anyone read what I write, re: don't hijack the thread? :-) ) About all of your disks: All of your disks are undergoing regular/periodic SMART short and long tests. Please stop this; it really, truly does no good. You will experience performance hits during these tests. About timeouts: Timeouts seen on the controller and driver level can happen in this situation; this is universal. This is usually what features like Western Digital's TLER and Hitachi + Samsung's CCTL can help alleviate, but not fully solve. I think the ada(4) default timeout of 30 seconds is a decent value, to be quite honest, but I'm not sure what the AHCI driver timeout is. mav@ would need to clue me in, or I'd need to go look at the source. 
(Right now in my life is not a good time for me to be reviewing source code or looking at commits, sadly. Too much on my mind recently.)

I can discuss the TLER/CCTL stuff more at length if needed, but to be blatantly honest, I would rather not and here's why: people begin to rely on these features to try and circumvent actual problems with their drives. Phrased differently: people on the Internet become incredibly focused on all of these timeout durations (TLER/CCTL vs. controller vs. driver vs. storage subsystem timeouts) and try to find some bizarre "perfect harmony" between them all. Instead, just leave them all alone and watch your drives for problems.

Further details which pertain to Samsung drives:

In your case, you run smartd(8), which periodically hits the drive with SMART requests, pulling attribute data down and parsing it. I believe your model is fine for this, but for similar Samsung models, I must strongly advise against this. There are well-documented problems with Samsung firmwares and SMART behaviour which can result in data loss (yes you read that right). Please see smartmontools' Wiki page on the matter for full details. Just make sure you're running a fixed firmware:

http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks

Regarding throughput of the drives being slow (30-40MBytes/sec across a gigE link):

This sounds more like a Samba tuning problem, but ZFS raidz isn't known for "amazing speed" per se. Please see a post of mine from a while back on how to tune Samba, which many followed up to with appreciation stating their throughput increased dramatically:

http://lists.freebsd.org/pipermail/freebsd-stable/2011-February/061642.html

I should follow up to that post with the following entry, because I've since updated my own smb.conf to tune things a bit better, and include comments as to the justifications:

#
# The below options increase throughput substantially. Be aware
# that AIO support requires the aio.ko kernel module loaded,
# and Samba to be built with AIO enabled. Important notes:
#
# 1) We explicitly disable sendfile(2) because it has known
# problems on ZFS, including resulting in 2x the amount of memory
# used on the machine (VM cache + ZFS cache). For further details,
# see freebsd-fs or freebsd-stable thread, subject "8.1-STABLE:
# zfs and sendfile: problem still exists".
#
# 2) (2011/10/03) socket options SO_SNDBUF and SO_RCVBUF do not
# appear to matter on FreeBSD, or our sysctls somehow take care of
# this (or maybe AIO?). The performance is the same with or without
# these two socket options on 8.2-STABLE.
#
# 3) (2011/10/03) My previously-mentioned "aio write behind" option
# is incorrect; see the official smb.conf(5) man page for the syntax.
# It's not a yes/no toggleable, thus serves no purpose.
#
socket options = TCP_NODELAY
use sendfile = no
min receivefile size = 16384
aio read size = 16384
aio write size = 16384

The rest is in the thread I linked. Hope this helps.

--
| Jeremy Chadwick
Re: problems with AHCI on FreeBSD 8.2
On Tue, 14 Feb 2012 20:24:32 +0100, Harald Schmalzbauer wrote:
> I guess it's always the firmware of the EcoGreen models which cause
> these problems. Your drive isn't EG...
> I don't remember exactly the different model numbers, but I'm sure
> they were all EcoGreen. The lower power consumption was the reason to
> choose these specific drives (different capacities and F2/F3 series
> tried), with acceptable performance loss - I thought. But it turned
> out that EcoGreen and NCQ as well as RAIDZ demands don't fit
> together...

Hi,

I intentionally did not buy any Eco or Green model because I don't like them (Load_Cycle_Count bugs and so on). I realized I would rather use 1 Watt more power and have the performance doubled.

--
Martin
Re: New BSD Installer
On Tue, Feb 14, 2012 at 9:43 AM, Devin Teske wrote: > > >> -Original Message- >> From: owner-freebsd-sta...@freebsd.org [mailto:owner-freebsd- >> sta...@freebsd.org] On Behalf Of Ian Smith >> Sent: Tuesday, February 14, 2012 9:15 AM >> To: Bruce Cran >> Cc: FreeBSD Stable Mailing List; Joe Holden; Alex Samorukov >> Subject: Re: New BSD Installer >> >> Strangely, the big push to GPT partitions was oft said to be because MBR >> slices provided too few partitions. > > That's part of it (no pun intended). > > The other big deal is that you can't exceed 2TB on a single primary partition. > > >> I never found 4 * 6 much of a limit >> myself, and now the default install makes a Linux-like single partition, >> rendering dump & restore more or less unusable or at least impractical, > > I'm with you on this one. I really don't like the single-"/" setup. > > >> while booting multiple systems on GPT also seems to require Linux tools. >> >> I don't know whether this move away from BSD traditional filesystem >> partitioning (/, /var, /usr etc) to Linux-style came down from Core On >> High or is just the prerogative of installer-writers? Jordan was both >> the latter and a big part of the former for many years, but I guess >> that's something that can be reverted if people feel to do so. >> > > Maybe a vote should be taken. There's about 12 votes in this office here alone > for putting the partition scheme back the way it was (Colin Percival had a > great > formula for determining partition sizes). I suggest that both be implemented, which looks to the untrained eye as a straight-forward thing to implement, and then the install ask if a single partition or a traditional multi-partition system should be installed. I prefer multi and use that on all of my systems. I also really prefer GPT for a variety of reasons, but we need better tools to support things. I miss booteasy. Yes, you can get it to boot from a different partition, but it is a pain. 
I deal with it by putting FreeBSD on one disk and Windows on another when I want a dual-boot system. I put the MBR-formatted (Windows) disk first in the boot order, so I can just hit F5 to boot the FreeBSD disk. This works for me, but I suspect that lots of people would prefer having multiple OSes on a single disk...especially when it's a single spindle laptop. (I suspect laptops are more commonly dual-boot than most any other platform.) As for fdisk and bsdlabel, I'm happy to see both go. They have a horrid user interface and require a calculator to get right. Yes, I use them, but only because there is no other way to do some things. (sade(8) comes closer all of the time, though.) -- R. Kevin Oberman, Network Engineer E-mail: kob6...@gmail.com
Re: problems with AHCI on FreeBSD 8.2
Martin Sugioarto wrote on 14.02.2012 19:23 (localtime):
> On Tue, 14 Feb 2012 18:17:19 +0100
> Harald Schmalzbauer wrote:
>
>>> I find it interesting that, at least so far, the only people
>>> reporting problems of this type with the ahci.ko driver are people
>>> using Samsung disks. The only difference is that your models are
>>> F1s while the OPs are F2s.
>> I saw such timeouts long ago and mav@ had a look at my postings and he
>> mentioned it could be a NCQ problem.
>> I suspected the disks firmware.
>> I never tracked it down further, because after replacing the Samsung
>> (F3 in that case) disks with hitachi ones solved all my problems and
>> gave a big performance kick as well (with zfs).
>> You can find the discussion here:
>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html
> Hi,
>
> I just want to add here that I am using 2 drives of type "Samsung
> HD103SJ" (SpinPoint F3). And I did not have problems with ZFS and with
> UFS either (for several years now). Everything has been deployed on top of
> ada(4) since FreeBSD-8.
>
> Actually the speed is very good (sequential read at 140 MB/s and more).

I guess it's always the firmware of the EcoGreen models that causes these problems. Your drive isn't EG...

I don't remember the exact model numbers, but I'm sure they were all EcoGreen. The lower power consumption was the reason I chose these specific drives (I tried different capacities and both the F2 and F3 series), with what I thought would be an acceptable performance loss. But it turned out that EcoGreen, NCQ, and RAIDZ demands don't fit together...

-Harry
Re: hang during dump (reproducible)
On Feb 10, 2012, at 9:50 PM, Jake Holland wrote:
>
> Many thanks to Attilio Rao, Kostik Belousov, and Andriy Gapon. And anybody
> else involved.
>
> However, when I looked at the commit I noticed this:
>> $ svn log -r228424 svn://svn.freebsd.org/base
> ...
>> MFC after: 3 months (or never)
>
> I'm not sure whether "never" is still considered an option, but it would be
> useful for me if 8.3 release, when it comes, does not hang this way during
> panic. But thanks for the patch, regardless.
>

Agreed - if this commit could be MFC'd for 8.3 it would be much appreciated.

-Andrew

--
Andrew Boyer abo...@averesystems.com
Re: 9-stable : geli + one-disk ZFS fails
Hi, Martin Simmons writes: > Some random ideas: > > 1) Can you dd the whole of ada0s3.eli without errors? I just started it; will take some hours > 2) If you scrub a few more times, does it find the same number of errors each > time and are they always in that XNAT.tar file? I deleted the XNAT.tar; I also copied files by 'ssh tar -c | tar -xp' to rule out NFS, same type of errors; Looks like multiple scrubs give the same files but not the same number of chksum errors (to be confirmed) > 3) Can you try zfs without geli? sure, I will split the place in one partition with geli and one without > 4) Is the slice/partition layout definitely correct? I (still ???) use sysinstall to do the dirty computations in my place. This is what gpart says (looks OK (to me ...) : [root@cc ~]# gpart list ada0 Geom name: ada0 modified: false state: OK fwheads: 16 fwsectors: 63 last: 976773167 first: 63 entries: 4 scheme: MBR Providers: 1. Name: ada0s1 Mediasize: 40802001408 (38G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 32256 Mode: r0w0e0 rawtype: 7 length: 40802001408 offset: 32256 type: ntfs index: 1 end: 79691471 start: 63 2. Name: ada0s2 Mediasize: 34359607296 (32G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 2147328000 Mode: r3w3e5 attrib: active rawtype: 165 length: 34359607296 offset: 40802033664 type: freebsd index: 2 end: 146800079 start: 79691472 3. Name: ada0s3 Mediasize: 424946221056 (395G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 2147196928 Mode: r1w1e1 rawtype: 165 length: 424946221056 offset: 75161640960 type: freebsd index: 3 end: 976773167 start: 146800080 Consumers: 1. Name: ada0 Mediasize: 500107862016 (465G) Sectorsize: 512 Mode: r4w4e10 Merci, Arno > __Martin > > >> On Mon, 13 Feb 2012 23:39:06 +0100, Arno J Klaassen said: >> >> hello, >> >> to eventually gain interest in this issue : >> >> I updated to today's -stable, tested with vfs.zfs.debug=1 >> and vfs.zfs.prefetch_disable=0, no difference. 
>> >> I also tested to read the raw partition : >> >> [root@cc /usr/ports]# dd if=/dev/ada0s3 of=/dev/null bs=4096 conv=noerror >> 103746636+0 records in >> 103746636+0 records out >> 424946221056 bytes transferred in 13226.346738 secs (32128768 bytes/sec) >> [root@cc /usr/ports]# >> >> Disk is brand new, looks ok, either my setup is not good or there is >> a bug somewhere; I can play around with this box for some more time, >> please feel free to provide me with some hints what to do to be useful >> for you. >> >> Best, >> >> Arno >> >> >> "Arno J. Klaassen" writes: >> >> > Hello, >> > >> > >> > I finally decided to 'play' a bit with ZFS on a notebook, some years >> > old, but I installed a brand new disk and memtest passes OK. >> > >> > I installed base+ports on partition 2, using 'classical' UFS. >> > >> > I crypted partition 3 and created a single zpool on it containing >> > 4 Z-"file-systems" : >> > >> > [root@cc ~]# zfs list >> > NAME USED AVAIL REFER MOUNTPOINT >> > zfiles 10.7G 377G 152K /zfiles >> > zfiles/home 10.6G 377G 119M /zfiles/home >> > zfiles/home/arno 10.5G 377G 2.35G /zfiles/home/arno >> > zfiles/home/arno/.priv192K 377G 192K /zfiles/home/arno/.priv >> > zfiles/home/arno/.scito 8.18G 377G 8.18G /zfiles/home/arno/.scito >> > >> > >> > I export the ZFS's via nfs and rsynced on the other machine some backup >> > of my current note-book (geli + UFS, (almost) same 9-stable version, no >> > problem) to the ZFS's. >> > >> > >> > Quite fast, I see on the notebook : >> > >> > >> > [root@cc /usr/temp]# zpool status -v >> >pool: zfiles >> > state: ONLINE >> > status: One or more devices has experienced an error resulting in data >> > corruption. Applications may be affected. >> > action: Restore the file in question if possible. Otherwise restore the >> > entire pool from backup. 
>> > see: http://www.sun.com/msg/ZFS-8000-8A >> >scan: scrub repaired 0 in 0h1m with 11 errors on Sat Feb 11 14:55:34 >> >2012 >> > config: >> > >> > NAME STATE READ WRITE CKSUM >> > zfilesONLINE 0 011 >> >ada0s3.eli ONLINE 0 023 >> > >> > errors: Permanent errors have been detected in the following files: >> > >> > /zfiles/home/arno/.scito/contrib/XNAT.tar >> > [root@cc /usr/temp]# md5 /zfiles/home/arno/.scito/contrib/XNAT.tar >> > md5: /zfiles/home/arno/.scito/contrib/XNAT.tar: Input/output error >> > [root@cc /usr/temp]# >> > >> > >> > As said, memtest is OK, nothing is logged to the console, UFS on the >> > same disk works OK (I did some tests copying and comparing random data) >> > and smartctl as well seems to trust the disk : >> > >> > SMART Self-test log structure revision number 1 >> > Num Test_Description
Re: freebsd 9-stable TOP problem from around Jan 10
On Tue, Feb 14, 2012 at 12:23 AM, Julian Elischer wrote:
> Has anyone else seen a problem with top -H -S?
>
> after a short while the screen gets more and more corrupted..
>
> hitting ^L or turning off S & H modes helps .. for a while.
>
> If this is a known fixed problem, let me know but I need to co-ordinate with
> others to upgrade the machine in question.

Not seeing it here on 9-stable. Could it be a display issue? I am using gnome-terminal with TERM defined as 'xterm'.

--
R. Kevin Oberman, Network Engineer
E-mail: kob6...@gmail.com
Re: problems with AHCI on FreeBSD 8.2
On Tue, 14 Feb 2012 18:17:19 +0100, Harald Schmalzbauer wrote:
> > I find it interesting that, at least so far, the only people
> > reporting problems of this type with the ahci.ko driver are people
> > using Samsung disks. The only difference is that your models are
> > F1s while the OPs are F2s.
>
> I saw such timeouts long ago and mav@ had a look at my postings and he
> mentioned it could be a NCQ problem.
> I suspected the disks firmware.
> I never tracked it down further, because after replacing the Samsung
> (F3 in that case) disks with hitachi ones solved all my problems and
> gave a big performance kick as well (with zfs).
> You can find the discussion here:
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html

Hi,

I just want to add here that I am using 2 drives of type "Samsung HD103SJ" (SpinPoint F3). I have not had problems with either ZFS or UFS (for several years now). Everything has been deployed on top of ada(4) since FreeBSD-8.

Actually the speed is very good (sequential read at 140 MB/s and more).

--
Martin
Re: 6.2-Release ..ish.. CF + ata == freeze?
On Tue, 2012-02-14 at 00:12 -0500, Jason Hellenthal wrote: > > On Mon, Feb 13, 2012 at 08:43:08PM -0800, john fleming wrote: > > Just thought i would post over here as i'm not getting a warm fuzzy from > > checkpoint about being able to find the root cause of an issue. I have a > > large install base of IPSO checkpoint firewalls, which are based on FreeBSD > > 6.2. I've had 3 firewalls hang basically the same way, with something that > > looks like a filesystem issue or an issue with a CF card. > > > > Does anyone happen to know of any bugs (i've been looking around) that > > could cause something like that? Granted, it could be a batch of bad CF > > cards, but its odd that i'm seeing the same thing on 3 different boxes and > > once rebooted they seem ok. > > > > Also is it possible to get useful info form the atacontroller when things > > go south like this from the ddb prompt? > > > > This is what shows in show msgbuf > > ad0: timeout waiting to issue command > > ad0: error issuing WRITE command > > ad0: timeout waiting to issue command > > ad0: error issuing WRITE command > > ad0: timeout waiting to issue command > > ad0: error issuing WRITE command > > ad0: timeout waiting to issue command > > ad0: error issuing WRITE command > > g_vfs_done():ad0s4h[WRITE(offset=33849344, length=131072)]error = 5 > > g_vfs_done():ad0s4h[WRITE(offset=33980416, length=131072)]error = 5 > > g_vfs_done():ad0s4h[WRITE(offset=34111488, length=131072)]error = 5 > > g_vfs_done():ad0s4h[WRITE(offset=34242560, length=131072)]error = 5 > > g_vfs_done():ad0s4h[WRITE(offset=34373632, length=131072)]error = 5 > > > > ad0: 1882MB at ata0-master PIO4 > > atapci0: port > > 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x5070-0x507f mem 0x80301000-0x803013ff > > at device 31.1 on pci0 > > ata0: on atapci0 > > ata1: on atapci0 > > atapci1: port > > 0x5088-0x508f,0x50a4-0x50a7,0x5080-0x5087,0x50a0-0x50a3,0x5060-0x506f irq > > 15 at device 31.2 on pci0 > > ata2: on atapci1 > > ata3: on atapci1ad0s4h is 
basically a r/w ufs partition on > > the box where almost anything that needs to be written goes. > > trace > > Tracing pid 1101 tid 100043 td 0x656d8460 > > kdb_enter(608cc388,6246,656d8460,64ba1400,6095d580,...) at kdb_enter+0x2b > > siointr1(64ba1400) at siointr1+0xf0 > > siointr(64ba1400) at siointr+0x38 > > intr_execute_handler(6095d580,f0a4ab04,6,6095d580,f0a4aafc,...) at > > intr_execute_handler+0x61 > > intr_execute_handlers(6095d580,f0a4ab04,6,0,656d8460,...) at > > intr_execute_handlers+0x40 > > atpic_handle_intr(4) at atpic_handle_intr+0x96 > > Xatpic_intr4() at Xatpic_intr4+0x20 > > --- interrupt, eip = 0x606044af, esp = 0xf0a4ab48, ebp = 0xf0a4ab5c --- > > lockmgr(e1456a04,6,0,656d8460) at lockmgr+0x58f > > getdirtybuf(e14569a4,60a405e4,1) at getdirtybuf+0x2e2 > > flush_deplist(68b30850,1,f0a4abb8) at flush_deplist+0x30 > > flush_inodedep_deps(656fa28c,1f235) at flush_inodedep_deps+0xcf > > softdep_sync_metadata(65964618) at softdep_sync_metadata+0x61 > > ffs_syncvnode(65964618,1) at ffs_syncvnode+0x3a2 > > ffs_fsync(f0a4ac74) at ffs_fsync+0x12 > > VOP_FSYNC_APV(60949260,f0a4ac74) at VOP_FSYNC_APV+0x38 > > fsync(656d8460,f0a4acb4) at fsync+0x170 > > syscall(805003b,806003b,5fbf003b,805,288be450,...) at syscall+0x2ee > > Xint0x80_syscall() at Xint0x80_syscall+0x1f > > This looks to be a problem with softupdates and CF cards. Can you get > this to repeat on a brand new (good) card ? > EIO errors on a write that lead to a panic nearly always backtrace into the softupdates code, because that code pretty much has to panic if it can't write things in the proper order. That doesn't imply that the softupdates code is at fault in any way, or that the errors would go away if softupdates were turned off. In fact, I consider it important to have softupdates enabled on CF and SDCard media. 
The number of writes (and especially of repeated re-writes of the same filesystem metadata sectors) goes way way up without SU enabled, and that's bad for media with a limited number of write cycles in its lifetime. We've been using 6.2 with SU enabled on CF cards for many years at Symmetricom; we're still shipping systems with that config. Depending on the motherboard or SBC, we often have to disable ata DMA, or limit it to a max of WDMA2 mode. The indication that you need to do so is typically a lockup either trying to load the kernel and modules, or sometimes that works but it locks up while initializing the ata driver. [1] If your systems have been running fine with DMA enabled, it's not the sort of problem that suddenly appears out of the blue. You find out when you need to disable it pretty quickly on new hardware because it doesn't boot reliably. I tend to agree with Jeremy's assesment that you may have some CF cards that have neared the end of their life, and especially if they're full the automatic wear leveling can't find any un-worn cells to use. If the cards are old they may have primitive wear-leve
Re: 9-stable : geli + one-disk ZFS fails
Hallo Aleksandr, > Hello, Arno J. Klaassen! > > On Sat, Feb 11, 2012 at 04:53:10PM +0100 > a...@heho.snv.jussieu.fr wrote about "9-stable : geli + one-disk ZFS fails": >> >> Hello, >> >> >> I finally decided to 'play' a bit with ZFS on a notebook, some years >> old, but I installed a brand new disk and memtest passes OK. >> >> I installed base+ports on partition 2, using 'classical' UFS. >> >> I crypted partition 3 and created a single zpool on it containing >> 4 Z-"file-systems" : >> >> [root@cc ~]# zfs list >> NAME USED AVAIL REFER MOUNTPOINT >> zfiles 10.7G 377G 152K /zfiles >> zfiles/home 10.6G 377G 119M /zfiles/home >> zfiles/home/arno 10.5G 377G 2.35G /zfiles/home/arno >> zfiles/home/arno/.priv192K 377G 192K /zfiles/home/arno/.priv >> zfiles/home/arno/.scito 8.18G 377G 8.18G /zfiles/home/arno/.scito >> >> >> I export the ZFS's via nfs and rsynced on the other machine some backup >> of my current note-book (geli + UFS, (almost) same 9-stable version, no >> problem) to the ZFS's. >> >> >> Quite fast, I see on the notebook : >> >> >> [root@cc /usr/temp]# zpool status -v >>pool: zfiles >> state: ONLINE >> status: One or more devices has experienced an error resulting in data >> corruption. Applications may be affected. >> action: Restore the file in question if possible. Otherwise restore the >> entire pool from backup. 
>> see: http://www.sun.com/msg/ZFS-8000-8A
>>  scan: scrub repaired 0 in 0h1m with 11 errors on Sat Feb 11 14:55:34 2012
>> config:
>>
>>         NAME          STATE     READ WRITE CKSUM
>>         zfiles        ONLINE       0     0    11
>>           ada0s3.eli  ONLINE       0     0    23
>>
>> errors: Permanent errors have been detected in the following files:
>>
>>         /zfiles/home/arno/.scito/contrib/XNAT.tar
>> [root@cc /usr/temp]# md5 /zfiles/home/arno/.scito/contrib/XNAT.tar
>> md5: /zfiles/home/arno/.scito/contrib/XNAT.tar: Input/output error
>> [root@cc /usr/temp]#
>>
>>
>> As said, memtest is OK, nothing is logged to the console, UFS on the
>> same disk works OK (I did some tests copying and comparing random data)
>> and smartctl as well seems to trust the disk :
>>
>> SMART Self-test log structure revision number 1
>> Num  Test_Description    Status                   Remaining  LifeTime(hours)
>> # 1  Extended offline    Completed without error        00%        388
>> # 2  Short offline       Completed without error        00%        387
>>
>>
>> Am I doing something wrong and/or let me know what I could provide as
>> extra info to try to solve this (dmesg.boot at the end of this mail).
>>
>> Thanx a lot in advance,
>>
>> best, Arno
>
> Arno, you forgot to say how you created the geli partition.
> It is important.

geli init /dev/ada0s3

(should I have used ' -s 4096 ' ???)

I added later :

geli attach -k /tmp/ifmemoryfails.key1 -p /dev/ada0s3

In fact, on my regular laptop, on which I now use UFS on top of GELI, I use /dev/ada0s3f, not the whole partition.

Hope this helps ;-)

thanx, best,

Arno
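On the question raised earlier in this thread of whether the slice layout is correct: the figures from the posted `gpart list ada0` output can be cross-checked mechanically. The sketch below uses the numbers copied from that output and checks that each slice's offset and length follow from its start and end sectors, that the slices do not overlap, and whether each slice starts on a 4096-byte boundary. The alignment check only matters if the disk or the geli provider works in 4K sectors, which the thread does not confirm; the function and field names are illustrative.

```python
SECTOR = 512  # Sectorsize as reported by gpart

# (name, start sector, end sector, offset in bytes, length in bytes),
# copied from the gpart list output posted in the thread
slices = [
    ("ada0s1", 63, 79691471, 32256, 40802001408),
    ("ada0s2", 79691472, 146800079, 40802033664, 34359607296),
    ("ada0s3", 146800080, 976773167, 75161640960, 424946221056),
]


def check(slices, sector=SECTOR):
    """Return one dict of consistency checks per slice."""
    report = []
    prev_end = -1
    for name, start, end, offset, length in slices:
        report.append({
            "name": name,
            "offset_ok": start * sector == offset,
            "length_ok": (end - start + 1) * sector == length,
            "no_overlap": start > prev_end,
            "aligned_4k": offset % 4096 == 0,
        })
        prev_end = end
    return report


for row in check(slices):
    print(row)
```

With the posted numbers the offsets and lengths are self-consistent and nothing overlaps; only ada0s1 (the NTFS slice starting at sector 63) sits off a 4K boundary. That by itself does not explain the checksum errors, it only rules out an arithmetic mistake in the layout.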
RE: New BSD Installer
> -Original Message- > From: owner-freebsd-sta...@freebsd.org [mailto:owner-freebsd- > sta...@freebsd.org] On Behalf Of Lars Engels > Sent: Tuesday, February 14, 2012 9:28 AM > To: Ian Smith > Cc: Bruce Cran; Alex Samorukov; Joe Holden; FreeBSD Stable Mailing List > Subject: Re: New BSD Installer > > On Wed, Feb 15, 2012 at 04:15:17AM +1100, Ian Smith wrote: > > On Sun, 12 Feb 2012 15:32:51 +, Bruce Cran wrote: > > > On 2/10/2012 7:47 PM, Alex Samorukov wrote: > > > > I am highly against reverting. Old installer is not GPT aware and in > > fact > > > > is unmaintained for a very long time. > > > > > > That's not really correct: quite a lot of work was done on it last year. > > > > Indeed. Was it you working on the updated sade(8) adding GPT and ZFS? > > > > > > > > I don't see it in terms of reverting. Much other useful functionality > > of sysinstall has yet to be reimplemented. > > What exactly are you missing? > There's sysutils/host-setup to configure your system like sysinstall > did. sysutils/host-setup (written/maintained by me) is only good for the following bits right now: 1. Time zone 2. Hostname/Domain 3. Network Interfaces 4. Default Router/Gateway 5. DNS nameservers There's still quite a bit more that sysinstall(8) offered which isn't provided by anything yet (sysutils/host-setup included). > There's sade and I am working on a tool to browse and add packages from > the installation media and / or the ftp mirrors. Ron McDowell and I are working on a new tool named "bsdconfig(8)" which is very modular and written in sh(1). bsdconfig(8) is designed squarely at reimplementing all of the sysinstall(8) post-install bits so that we can cleanly whack sysinstall(8) without the prior complaints. 
The portion of bsdconfig(8) that will handle browsing and adding of packages from either the installation media or ftp mirrors is incomplete at the moment, and we'd love it if you were willing to either: (a) download the preliminary framework for bsdconfig(8) and start working on the packages module, or (b) join the SourceForge CVS project and start working on bsdconfig(8) in realtime with Ron and I NOTE: Choice of either option will result in further information being disbursed for your digestive pleasure. So far, bsdconfig(8) has the 8529 lines of code (counting all modules, internationalization files, and Makefiles) with the following modules/components (status listed for each): 1. Distributions Description: Install additional distribution sets Status: pending development 2. Documentation installation Description: Install FreeBSD Documentation set Status: Done. Links to "bsdinstall docsinstall" 3. Packages Description: Install Pre-packaged Software Status: pending development 4. Password Description: Set Root Password Status: pending development 5. Fdisk Description: Fdisk Partition Editor Status: pending development Note: Could be linked directly to sade(8) 6. Disklabel Description: Disk Label Editor Status: pending development Note: Could be linked directly to sade(8) 7. Login/Group Management Description: Add user's login and group information Status: Done (by Ron McDowell) 8. Console Description: Console Settings Status: pending development 9. Timezone Description: Set up Time Zone Status: Done (by Devin Teske; me) NOTE: Functionality shamelessly ripped from my ports addition: sysutils/tzdialog 10. Media Selection Description: Select Media to Install From Status: pending development 11. Mouse Description: Configure the Mouse Status: pending development 12. Networking Management Description: Setup Networking interfaces, services, etc. Status: Done (by Devin Teske; me) NOTE: Functionality shamelessly ripped from my ports addition: sysutils/host-setup 13. 
Security Description: Set Security Parameters Status: pending development 14. Startup Description: Set Startup Parameters Status: pending development 15. Ttys Description: Configure Ttys Status: pending development I am currently working on the framework some more and then I'm going to jump over to working on #14 "Startup". As you can see from the above-list, we have quite a bit of functionality to migrate from sysinstall(8) over to bsdconfig(8) -- however the most difficult bits (user management, network management, and timezone have all been done so the rest should fall like a house of cards -- especially since we have really nice modular includes making the modules nice and light-weight). -- Devin _ The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in
[releng_9 tinderbox] failure on ia64/ia64
TB --- 2012-02-14 15:28:02 - tinderbox 2.9 running on freebsd-stable.sentex.ca TB --- 2012-02-14 15:28:02 - starting RELENG_9 tinderbox run for ia64/ia64 TB --- 2012-02-14 15:28:02 - cleaning the object tree TB --- 2012-02-14 15:28:02 - cvsupping the source tree TB --- 2012-02-14 15:28:02 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_9/ia64/ia64/supfile TB --- 2012-02-14 15:29:05 - building world TB --- 2012-02-14 15:29:05 - CROSS_BUILD_TESTING=YES TB --- 2012-02-14 15:29:05 - MAKEOBJDIRPREFIX=/obj TB --- 2012-02-14 15:29:05 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-02-14 15:29:05 - SRCCONF=/dev/null TB --- 2012-02-14 15:29:05 - TARGET=ia64 TB --- 2012-02-14 15:29:05 - TARGET_ARCH=ia64 TB --- 2012-02-14 15:29:05 - TZ=UTC TB --- 2012-02-14 15:29:05 - __MAKE_CONF=/dev/null TB --- 2012-02-14 15:29:05 - cd /src TB --- 2012-02-14 15:29:05 - /usr/bin/make -B buildworld >>> World build started on Tue Feb 14 15:29:06 UTC 2012 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Tue Feb 14 17:15:08 UTC 2012 TB --- 2012-02-14 17:15:08 - generating LINT kernel config TB --- 2012-02-14 17:15:08 - cd /src/sys/ia64/conf TB --- 2012-02-14 17:15:08 - /usr/bin/make -B LINT TB --- 2012-02-14 17:15:08 - cd /src/sys/ia64/conf TB --- 2012-02-14 17:15:08 - /usr/sbin/config -m LINT TB --- 2012-02-14 17:15:08 - building LINT kernel TB --- 2012-02-14 17:15:08 - CROSS_BUILD_TESTING=YES TB --- 2012-02-14 17:15:08 - MAKEOBJDIRPREFIX=/obj TB --- 2012-02-14 17:15:08 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-02-14 17:15:08 - SRCCONF=/dev/null TB --- 2012-02-14 17:15:08 - TARGET=ia64 TB --- 
2012-02-14 17:15:08 - TARGET_ARCH=ia64 TB --- 2012-02-14 17:15:08 - TZ=UTC TB --- 2012-02-14 17:15:08 - __MAKE_CONF=/dev/null TB --- 2012-02-14 17:15:08 - cd /src TB --- 2012-02-14 17:15:08 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Tue Feb 14 17:15:09 UTC 2012 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] /src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:329: warning: implicit declaration of function 'mpssas_find_target_by_handle' /src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:329: warning: nested extern declaration of 'mpssas_find_target_by_handle' [-Wnested-externs] /src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:329: warning: assignment makes pointer from integer without a cast /src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:396: warning: implicit declaration of function 'mpssas_prepare_volume_remove' /src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:396: warning: nested extern declaration of 'mpssas_prepare_volume_remove' [-Wnested-externs] /src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:403: warning: assignment makes pointer from integer without a cast /src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:469: warning: assignment makes pointer from integer without a cast /src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:481: warning: assignment makes pointer from integer without a cast *** Error code 1 Stop in /src/sys/modules/mps. *** Error code 1 Stop in /src/sys/modules. *** Error code 1 Stop in /obj/ia64.ia64/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. 
TB --- 2012-02-14 17:49:05 - WARNING: /usr/bin/make returned exit code 1 TB --- 2012-02-14 17:49:05 - ERROR: failed to build LINT kernel TB --- 2012-02-14 17:49:05 - 5814.42 user 834.07 system 8463.93 real http://tinderbox.freebsd.org/tinderbox-releng_9-RELENG_9-ia64-ia64.full ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
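Some of the warnings in the log above pair up: a function used without a prototype is assumed to return int, so assigning its result to a pointer also draws the "makes pointer from integer" warning, as on line 329. A minimal sketch for pulling the implicitly declared function names out of such a log follows. The regular expression is written against the message format shown above, the sample lines are copied from it, and the helper name is made up for illustration.

```python
import re

LOG = """\
/src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:329: warning: implicit declaration of function 'mpssas_find_target_by_handle'
/src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:329: warning: assignment makes pointer from integer without a cast
/src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:396: warning: implicit declaration of function 'mpssas_prepare_volume_remove'
/src/sys/modules/mps/../../dev/mps/mps_sas_lsi.c:403: warning: assignment makes pointer from integer without a cast
"""

# Matches "<file>:<line>: warning: implicit declaration of function '<name>'"
IMPLICIT = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+): warning: "
    r"implicit declaration of function '(?P<func>\w+)'$"
)


def implicit_declarations(log):
    """Return (file, line, function) for each implicit-declaration warning."""
    found = []
    for text in log.splitlines():
        m = IMPLICIT.match(text.strip())
        if m:
            found.append((m["file"], int(m["line"]), m["func"]))
    return found


for path, line, func in implicit_declarations(LOG):
    print(f"{func} used without a prototype at {path}:{line}")
```

The resulting names are the ones to look for in the driver's headers; the log excerpt does not show which message actually stopped the build, so this only narrows the search.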
RE: New BSD Installer
> -----Original Message-----
> From: owner-freebsd-sta...@freebsd.org [mailto:owner-freebsd-sta...@freebsd.org] On Behalf Of Ian Smith
> Sent: Tuesday, February 14, 2012 9:15 AM
> To: Bruce Cran
> Cc: FreeBSD Stable Mailing List; Joe Holden; Alex Samorukov
> Subject: Re: New BSD Installer
>
> On Sun, 12 Feb 2012 15:32:51 +, Bruce Cran wrote:
>  > On 2/10/2012 7:47 PM, Alex Samorukov wrote:
>  > > I am highly against reverting. Old installer is not GPT aware and in fact
>  > > is unmaintained for a very long time.
>  >
>  > That's not really correct: quite a lot of work was done on it last year.
>
> Indeed. Was it you working on the updated sade(8) adding GPT and ZFS?
>
> I don't see it in terms of reverting. Much other useful functionality
> of sysinstall has yet to be reimplemented.

Ron McDowell and I are working feverishly on bsdconfig(8), set to arrive in 10.0-CURRENT.

Highlights:
- It's modular
- It's easily extensible/maintained (written in sh(1))
- Its goal is to completely reimplement all missing functionality from sysinstall(8)

However, it's still in the preliminary stages. Discussions on bsdconfig are being held on -sysinstall@.

Development work is being performed off the reservation (using a SourceForge CVS server) until we can agree on the structure prior to import into the base of the HEAD SVN tree.

Despite being preliminary code, there are currently 8529 lines of code so far.

I won't be posting links to the preliminary code (it's still preliminary) for fear of getting too much feedback too early in the game (but if you're interested, you can crawl the recent posts to -sysinstall@ and glean the links from both Ron and myself).

> Sure, I know, send code ..
> but it's not only the functionality lost, but the ability for new users
> to accomplish a good deal of initial server setup before they're skilled
> enough to do it all from the command line, which is where I was in '98.
>

bsdconfig(8) will fill this gap as sysinstall(8) did in the past.
The current plan moving forward is: 1. RELENG_9 will continue to offer both sysinstall and bsdinstall in the installed base 2. RELENG_10 will drop sysinstall(8) but bring in bsdconfig(8) This much has been agreed upon in the discussions involving many. > I also think much of the sometimes gratuitous deprecation of sysinstall > is unwarranted. Yes, it has been acknowledged by many that the scheduled deprecation is aggressive. > I've used sysinstall post-installation regularly since > '98 on 2.2.6 through 3.3, 4.4-10, 5.-5, 6.1, 7.0-4 and 8.0-2. Since one > small disaster on 3.3 about 12 years ago (installing to the wrong slice) > I've had no major issues with it, mostly partitioning all sorts of disks > but also browsing and adding useful packages at installation. > When bsdconfig(8) reaches a usable state (is entered into HEAD), we encourage you to be an avid tester in the early stages to make sure we "get it right" with respect to replication of sysinstall(8) features. bsdconfig(8) should work fine on RELENG_9 just as 10.0-CURRENT > Strangely, the big push to GPT partitions was oft said to be because MBR > slices provided too few partitions. That's part of it (no pun intended). The other big deal is that you can't exceed 2TB on a single primary partition. > I never found 4 * 6 much of a limit > myself, and now the default install makes a Linux-like single partition, > rendering dump & restore more or less unusable or at least impractical, I'm with you on this one. I really don't like the single-"/" setup. > while booting multiple systems on GPT also seems to require Linux tools. > > I don't know whether this move away from BSD traditional filesystem > partitioning (/, /var, /usr etc) to Linux-style came down from Core On > High or is just the prerogative of installer-writers? Jordan was both > the latter and a big part of the former for many years, but I guess > that's something that can be reverted if people feel to do so. > Maybe a vote should be taken. 
There's about 12 votes in this office here alone for putting the partition scheme back the way it was (Colin Percival had a great formula for determining partition sizes). > I expect most developers run mostly the latest gear, and nowadays tend > to use vbox images a good deal, but there will be many laptops and other > systems using MBR slices and bsdlabel partitions for years to come, and > I'd hate to see FreeBSD's longterm support for _slightly_ older hardware > disappear, just because of having added better support for latest kit. > Others will point out that if you try hard enough, you can create the old-style MBR partitions with RELENG_9 (note: some minor bugs were documented in 9.0-RELEASE; the next release will not suffer these fallbacks). > I for one will be screwed if sade, fdisk and bsdlabel disappear, as the > release notes for 9 seem to indicate may be imminently on the cards. > I too would be sad if those disappear. However, I do think
Re: CARP carpdev
On Tue, Feb 14, 2012 at 8:56 AM, Hugo Silva wrote:
> Looks like there's been conversations about porting this to FreeBSD since at
> least 2007.
>
> Are there any plans to have ifconfig carpdev available in 9.0-STABLE?

CARP support has been redone in 10-CURRENT, removing the whole "carp0" pseudo-interface support, and just enabling the CARP protocol on the existing network interfaces. This includes the equivalent of "carpdev" support.

Search the -current archives for more information, CFT, and so on.

I don't recall seeing anything about specific plans to MFC to stable/9, but I could be misremembering.

--
Freddie Cash
fjwc...@gmail.com
Re: New BSD Installer
On Wed, Feb 15, 2012 at 04:15:17AM +1100, Ian Smith wrote:
> On Sun, 12 Feb 2012 15:32:51 +, Bruce Cran wrote:
>  > On 2/10/2012 7:47 PM, Alex Samorukov wrote:
>  > > I am highly against reverting. Old installer is not GPT aware and in fact
>  > > is unmaintained for a very long time.
>  >
>  > That's not really correct: quite a lot of work was done on it last year.
>
> Indeed. Was it you working on the updated sade(8) adding GPT and ZFS?
>
> I don't see it in terms of reverting. Much other useful functionality
> of sysinstall has yet to be reimplemented.

What exactly are you missing?

There's sysutils/host-setup to configure your system like sysinstall did. There's sade, and I am working on a tool to browse and add packages from the installation media and/or the ftp mirrors.
Re: problems with AHCI on FreeBSD 8.2
Jeremy Chadwick wrote on 14.02.2012 17:50 (localtime):
> On Tue, Feb 14, 2012 at 04:55:10PM +0100, Claudius Herder wrote:
>> Hello,
>>
>> I have got a quite similar problem with AHCI on FreeBSD 8.2 and it still
>> persists on FreeBSD 9.0 release.
>>
>> Switching from ahci to ataahci resolved the problem for me too.
>>
>> I'm using gmirror for swap, system is on a zpool and the problem first
>> occurred during a zpool scrub, but it is easily reproducible with dd.
>>
>> The timeouts only occur when writing to disks, dd if=/dev/ada{0|1}
>> of=/dev/null is not an issue.
>> Sometimes I need to power off the server because after a reboot one disk
>> is still missing.
>>
>> I really would like to help in this issue, so let me know if you need
>> any more information.
> I find it interesting that, at least so far, the only people reporting
> problems of this type with the ahci.ko driver are people using Samsung
> disks. The only difference is that your models are F1s while the OPs
> are F2s.

I saw such timeouts long ago; mav@ had a look at my postings and mentioned it could be an NCQ problem. I suspected the disks' firmware. I never tracked it down further, because replacing the Samsung disks (F3 in that case) with Hitachi ones solved all my problems and gave a big performance kick as well (with ZFS).

You can find the discussion here:
http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html

JFI

-Harry
Re: LSI supported mps(4) driver in stable/9 and stable/8
On Mon, Feb 13, 2012 at 15:08:45 +0100, Ollivier Robert wrote:
> According to Kenneth D. Merry:
> > The LSI-supported version of the mps(4) driver that supports their 6Gb SAS
> > HBAs as well as WarpDrive controllers, is now in stable/9 and stable/8.
>
> Thanks.
>
> > Note that the CAM infrastructure changes that went into FreeBSD/head along
> > with this driver have not gone into either stable/9 or stable/8. Only the
> > driver itself has been merged.
> >
> > The CAM infrastructure changes depend on some other da(4) driver changes
> > that will need to get merged before they can go back. If that merge
> > happens, it will probably only be into stable/9.
>
> Got an ETA for this? Saying differently, is it reasonable to run stable/9
> with the new driver but w/o the CAM changes? What do these changes bring
> BTW? Sorry, been out-of-touch these days :(

No ETA for the CAM changes. I need to talk with Alexander Motin about it, and I haven't gotten around to that. Too busy with other things.

The changes just allow the driver to get notification from CAM about read capacity data instead of having the driver probe by itself. The probe in the driver for stable is kludgy, but does work.

So it is perfectly fine to run the driver in stable/9 or stable/8 without the CAM changes.

The latest mps(4) driver changes have been merged into stable/9 and stable/8, so this would be a good time to try it out.

Ken
--
Kenneth Merry
k...@freebsd.org
Re: New BSD Installer
On Sun, 12 Feb 2012 15:32:51 +, Bruce Cran wrote: > On 2/10/2012 7:47 PM, Alex Samorukov wrote: > > I am highly against reverting. Old installer is not GPT aware and in fact > > is unmaintained for a very long time. > > That's not really correct: quite a lot of work was done on it last year. Indeed. Was it you working on the updated sade(8) adding GPT and ZFS? I don't see it in terms of reverting. Much other useful functionality of sysinstall has yet to be reimplemented. Sure, I know, send code .. but it's not only the functionality lost, but the ability for new users to accomplish a good deal of initial server setup before they're skilled enough to do it all from the command line, which is where I was in '98. I also think much of the sometimes gratuitous deprecation of sysinstall is unwarranted. I've used sysinstall post-installation regularly since '98 on 2.2.6 through 3.3, 4.4-10, 5.-5, 6.1, 7.0-4 and 8.0-2. Since one small disaster on 3.3 about 12 years ago (installing to the wrong slice) I've had no major issues with it, mostly partitioning all sorts of disks but also browsing and adding useful packages at installation. Strangely, the big push to GPT partitions was oft said to be because MBR slices provided too few partitions. I never found 4 * 6 much of a limit myself, and now the default install makes a Linux-like single partition, rendering dump & restore more or less unusable or at least impractical, while booting multiple systems on GPT also seems to require Linux tools. I don't know whether this move away from BSD traditional filesystem partitioning (/, /var, /usr etc) to Linux-style came down from Core On High or is just the prerogative of installer-writers? Jordan was both the latter and a big part of the former for many years, but I guess that's something that can be reverted if people feel to do so. 
I expect most developers run mostly the latest gear, and nowadays tend to use vbox images a good deal, but there will be many laptops and other systems using MBR slices and bsdlabel partitions for years to come, and I'd hate to see FreeBSD's longterm support for _slightly_ older hardware disappear, just because of having added better support for latest kit. I for one will be screwed if sade, fdisk and bsdlabel disappear, as the release notes for 9 seem to indicate may be imminently on the cards.

cheers, Ian
CARP carpdev
Looks like there's been conversations about porting this to FreeBSD since at least 2007.

Are there any plans to have ifconfig carpdev available in 9.0-STABLE?
Complete hang on 9.0-RELEASE
Hi folks,

For the record, I was running some tests yesterday on top of a 9.0-RELEASE, amd64, kernel when the box hung. At the time of the hang, the box was running a process with about 2800 threads with heavy IPC between 1400 writers and 1400 readers. The box was in single user mode (/bin/sh coming from FreeBSD 7.4-STABLE).

Here is the beginning of the dmesg:

Copyright (c) 1992-2012 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.0-RELEASE #0: Tue Jan 3 07:46:30 UTC 2012
    r...@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
CPU: Intel(R) Atom(TM) CPU D510 @ 1.66GHz (1666.70-MHz K8-class CPU)
  Origin = "GenuineIntel" Id = 0x106ca Family = 6 Model = 1c Stepping = 10
  Features=0xbfebfbff
  Features2=0x40e31d
  AMD Features=0x2800
  AMD Features2=0x1
  TSC: P-state invariant, performance statistics
real memory = 2137587712 (2038 MB)
avail memory = 2037841920 (1943 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: <070611 APIC1125>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 HTT threads
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP/HT): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP/HT): APIC ID: 3

I will restart the test and see if this happens again.

regards,
- Arnaud
Re: problems with AHCI on FreeBSD 8.2
On Tue, Feb 14, 2012 at 04:55:10PM +0100, Claudius Herder wrote: > > Hello, > > I have got a quite similar problem with AHCI on FreeBSD 8.2 and it still > persists on FreeBSD 9.0 release. > > Switching from ahci to ataahci resolved the problem for me too. > > I'm using gmirror for swap, system is on a zpool and the problem first > occurred during a zpool scrub, but it is easily reproducible with dd. > > The timeouts only occur when writing to disks, dd if=/dev/ada{0|1} > of=/dev/null is not an issue. > Sometimes I need to power off the server because after a reboot one disk > is still missing. > > I really would like to help in this issue, so let me know if you need > any more information. I find it interesting that, at least so far, the only people reporting problems of this type with the ahci.ko driver are people using Samsung disks. The only difference is that your models are F1s while the OPs are F2s. The only difference I can think of is that the ahci.ko driver may have more strict timeouts than the ata driver (ata driver includes ataahci; ataahci.ko != ahci.ko, as you know). You may be able to adjust these using loader.conf variables: kern.cam.ada.default_timeout kern.cam.ada.retry_count I also imagine that hint.ahci.X.ccc might have some involvement here, but it's something I am not familiar with. mav@ would need to comment on this -- it's outside of my familiarity scope. Furthermore, in your case, your ada1 disk has serious CRC-related problems, and your ada0 disk has seen similar just at a much lower rate. ada1 should probably be replaced (along with cables, dusting out SATA ports, etc.), but keeping ada0 is probably fine. The statistics for these are shown in the "smartctl -l sataphy" output, field labelled ID 0x0001, "Command failed due to ICRC error". These are SATA-level problems or physical problems which will manifest themselves as anomalies during any kind of I/O. 
The counters shown in ID 0x000a and 0x0009 are completely fine; these don't indicate any problems.

Your drives don't support GP log region 0x04, which is why "smartctl -l devstat" returns the errors it does. The errors you see coming from the kernel in this situation are 100% okay/acceptable; the drive itself is literally returning ABRT status to the inquiry submitted to it. Different drives from different vendors behave differently in this regard.

So, what I'm trying to say is, your problem looks different than the OP's. Let's not start a big "I have this problem too" thread; that has happened so many times over the years that when it happens I immediately bow out + stop participating in the thread.

> smartctl -l sataphy /dev/ada0
>
> SATA Phy Event Counters (GP Log 0x11)
> ID      Size     Value  Description
> 0x000a  2          150  Device-to-host register FISes sent due to a COMRESET
> 0x0001  2            3  Command failed due to ICRC error
> 0x0009  2          173  Transition from drive PhyRdy to drive PhyNRdy
>
> smartctl -l sataphy /dev/ada1
>
> SATA Phy Event Counters (GP Log 0x11)
> ID      Size     Value  Description
> 0x000a  2          155  Device-to-host register FISes sent due to a COMRESET
> 0x0001  2       65535+  Command failed due to ICRC error
> 0x0009  2          178  Transition from drive PhyRdy to drive PhyNRdy

--
| Jeremy Chadwick                                 j...@parodius.com |
| Parodius Networking                     http://www.parodius.com/ |
| UNIX Systems Administrator                 Mountain View, CA, US |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
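Editorial sketch: the ICRC check described in this message can be scripted. The snippet below is a minimal illustration in POSIX sh, assuming the four-column layout (ID, Size, Value, Description) that "smartctl -l sataphy" prints in this thread. The function names phy_counter and icrc_verdict are hypothetical helpers, not part of smartmontools, and reading a trailing "+" as a saturated counter is an assumption based on the 65535+ value quoted for ada1.

```shell
#!/bin/sh
# Hypothetical helpers; not part of smartmontools.

# Print the Value column for one SATA Phy Event Counter.
# Reads "smartctl -l sataphy" output on stdin; $1 is the ID (e.g. 0x0001).
phy_counter() {
    # Appending "" forces a string comparison of the ID column.
    awk -v id="$1" '($1 "") == (id "") { print $3; exit }'
}

# Classify an ICRC count. A trailing "+" is assumed to mean the
# counter hit its maximum and stopped counting.
icrc_verdict() {
    case "$1" in
        "") echo "missing" ;;
        *+) echo "saturated" ;;
        0)  echo "clean" ;;
        *)  echo "nonzero" ;;
    esac
}
```

On a live system this would be used as something like: smartctl -l sataphy /dev/ada1 | phy_counter 0x0001, with the result passed to icrc_verdict. A nonzero verdict is only a hint to look at cabling and ports, as the message says; it is not a diagnosis by itself.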
Re: Custom kernel poll summary
On 2/14/12 7:43 AM, Ian Smith wrote:
On Tue, 14 Feb 2012 2:37:55 +0100, Alexander Leidinger wrote:
> Here is what I got, the first column is the number of requests, the second
> what is requested, and the 3rd my comments (basically it means, if there is a
> comment, it is not needed/possible to include in a modular kernel):
> ---snip---
[..]
> 1 IPFIREWALL_FORWARD -> performance impact too big if unused (julian)

Well, it's not that big, but you will be running extra code for every packet unless you want it. When I made it an option I was mainly trying to placate the "just say no" crowd. I personally wouldn't mind having it on by default in GENERIC, as long as we still make it an option so people who want every last drop of cpu can remove it.

I expect Julian will object if I've mis-paraphrased or over-simplified something I recall him saying at least a couple of years ago :)
[..]
> 4 ALTQ* -> does add code to the pf module
> other impact?

ipfw(8) can also apply ALTQ tags, but relies on pfctl(8) to setup the queues - or so I read; I've not used it here. From altq(4):

ALTQ        Enable ALTQ.
ALTQ_CBQ    Build the ``Class Based Queuing'' discipline.
ALTQ_RED    Build the ``Random Early Detection'' extension.
ALTQ_RIO    Build ``Random Early Drop'' for input and output.
ALTQ_HFSC   Build the ``Hierarchical Packet Scheduler'' discipline.
ALTQ_CDNR   Build the traffic conditioner. This option is meaningless at the moment as the conditioner is not used by any of the available disciplines or consumers.
ALTQ_PRIQ   Build the ``Priority Queuing'' discipline.
ALTQ_NOPCC  Required if the TSC is unusable.
ALTQ_DEBUG  Enable additional debugging facilities.

Note that ALTQ-disciplines cannot be loaded as kernel modules. In order to use a certain discipline you have to build it into a custom kernel. The pf(4) interface, that is required for the configuration process of ALTQ can be loaded as a module.

So which disciplines would one choose? Seeming an unlikely candidate?

> 1 IPSTEALTH -> changes ipfw module only?

I don't think this is specific to ipfw. From /sys/conf/NOTES:

# IPSTEALTH enables code to support stealth forwarding (i.e., forwarding
# packets without touching the TTL). This can be useful to hide firewalls
# from traceroute and similar tools.

But can it be disabled once added to kernel? It's no good as a default.

> 1 IPFIREWALL_VERBOSE_LIMIT=5 -> changes ipfw module only?
> loader tunable?
> 1 IPFIREWALL_VERBOSE -> changes ipfw module only?
> loader tunable?

sysctl.conf: net.inet.ip.fw.verbose and net.inet.ip.fw.verbose_limit

cheers, Ian
Re: Why won't 8.2 umount -f?
Doug Barton wrote:
> On 02/13/2012 19:13, Rick Macklem wrote:
> > I just looked and at least some of the fixes were MFC'd to stable/8
> > about 8months ago. So, they aren't in 8.2, but will be in 8.3.
>
> Well 8.3 is about to enter code freeze, any way we can check to be sure
> all of the relevant fixes can be mfc'ed?
>
I took a look and they seem to have been MFC'd.

rick

> Doug
>
> --
>
> It's always a long day; 86400 doesn't fit into a short.
>
> Breadth of IT experience, and depth of knowledge in the DNS.
> Yours for the right price. :) http://SupersetSolutions.com/
Re: sysutils/pftop on 9.x+
On 14.02.12 17:14, Fabian Keil wrote:
> Greg Rivers wrote:
>
>> sysutils/pftop was marked broken on 9.x and above last March[1]. Are
>> there any plans to fix it soon? It's a really handy utility.
>>
>> [1]
>> http://www.freebsd.org/cgi/cvsweb.cgi/ports/sysutils/pftop/Makefile?rev=1.17
>
> Please have a look at:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=155938
>
> Note that the currently working fix is in the audit trail,
> the original fix stopped working after the PF update.

The PR was closed by mistake, I'll take care of it.

Florian
Re: sysutils/pftop on 9.x+
Greg Rivers wrote:

> sysutils/pftop was marked broken on 9.x and above last March[1]. Are
> there any plans to fix it soon? It's a really handy utility.
>
> [1]
> http://www.freebsd.org/cgi/cvsweb.cgi/ports/sysutils/pftop/Makefile?rev=1.17

Please have a look at:
http://www.freebsd.org/cgi/query-pr.cgi?pr=155938

Note that the currently working fix is in the audit trail, the original fix stopped working after the PF update.

Fabian
Re: Custom kernel poll summary (was: Re: Reducing the need to compile a custom kernel)
On Tue, Feb 14, 2012 at 7:43 AM, Ian Smith wrote:
> On Tue, 14 Feb 2012 2:37:55 +0100, Alexander Leidinger wrote:
> > 1 IPSTEALTH -> changes ipfw module only?
>
> I don't think this is specific to ipfw. From /sys/conf/NOTES:
>
> # IPSTEALTH enables code to support stealth forwarding (i.e., forwarding
> # packets without touching the TTL). This can be useful to hide firewalls
> # from traceroute and similar tools.
>
> But can it be disabled once added to kernel? It's no good as a default.

It's controllable via sysctl once it's compiled into the kernel. If it's not compiled into the kernel, then the sysctl doesn't exist.

> > 1 IPFIREWALL_VERBOSE_LIMIT=5 -> changes ipfw module only?
> > loader tunable?

This is controllable via sysctl. Not sure if it needs to be compiled into the kernel before it's controllable via sysctl, though. We have it compiled into all our firewall kernels (with a default of 1000), then change it via sysctl when needed.

> > 1 IPFIREWALL_VERBOSE -> changes ipfw module only?
> > loader tunable?
>
> sysctl.conf: net.inet.ip.fw.verbose and net.inet.ip.fw.verbose_limit

Ah, you list the sysctls that control the last two. :)

--
Freddie Cash
fjwc...@gmail.com
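Editorial sketch: the point above, that a sysctl only exists when the matching code is in the running kernel (or a loaded module), can be checked from a script before trying to set anything. The following is a minimal POSIX sh illustration. The two OID names come from the thread; the function names oid_present and report_ipfw_logging are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for probing sysctl OIDs; names are illustrative only.

# Return 0 if the named sysctl OID exists on the running kernel,
# non-zero otherwise (including when sysctl itself is unavailable).
oid_present() {
    sysctl -n "$1" >/dev/null 2>&1
}

# Print the current value of each ipfw logging knob named in the thread,
# or say that it is absent.
report_ipfw_logging() {
    for oid in net.inet.ip.fw.verbose net.inet.ip.fw.verbose_limit; do
        if oid_present "$oid"; then
            echo "$oid=$(sysctl -n "$oid")"
        else
            echo "$oid: not present"
        fi
    done
}
```

If report_ipfw_logging shows both knobs, a runtime change along the lines of "sysctl net.inet.ip.fw.verbose_limit=1000" (the value mentioned above) should be possible as root; to make it persistent, the same name=value pair would go into /etc/sysctl.conf.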
Re: problems with AHCI on FreeBSD 8.2
Hello, I have got a quite similar problem with AHCI on FreeBSD 8.2 and it still persists on FreeBSD 9.0 release. Switching from ahci to ataahci resolved the problem for me too. I'm using gmirror for swap, system is on a zpool and the problem first occurred during a zpool scrub, but it is easily reproducible with dd. The timeouts only occur when writing to disks, dd if=/dev/ada{0|1} of=/dev/null is not an issue. Sometimes I need to power off the server because after a reboot one disk is still missing. I really would like to help in this issue, so let me know if you need any more information. -- Claudius dmesg: --cut-- Jan 14 01:33:57 server kernel: ahcich0: Timeout on slot 7 port 0 Jan 14 01:33:57 server kernel: ahcich0: is cs 0080 ss rs 0080 tfd c0 serr cmd 0004c717 Jan 14 01:33:57 server kernel: ahcich1: Timeout on slot 31 port 0 Jan 14 01:33:57 server kernel: ahcich1: is cs 8000 ss rs 8000 tfd c0 serr cmd 0004df17 Jan 14 01:33:57 server kernel: ahcich0: Timeout on slot 7 port 0 Jan 14 01:33:57 server kernel: ahcich0: is cs f800 ss ff80 rs ff80 tfd c0 serr cmd 0004cb17 Jan 14 01:33:57 server kernel: ahcich1: Timeout on slot 31 port 0 Jan 14 01:33:57 server kernel: ahcich1: is cs 00f8 ss 80ff rs 80ff tfd c0 serr cmd 0004c317 Jan 14 01:33:57 server kernel: ahcich0: Timeout on slot 23 port 0 Jan 14 01:33:57 server kernel: ahcich0: is cs 0180 ss rs 0180 tfd c0 serr cmd 0004d717 Jan 14 01:33:57 server kernel: ahcich1: Timeout on slot 15 port 0 Jan 14 01:33:57 server kernel: ahcich1: is cs 00018000 ss rs 00018000 tfd c0 serr cmd 0004cf17 Jan 14 01:33:57 server kernel: ahcich1: Timeout on slot 17 port 0 Jan 14 01:33:57 server kernel: ahcich1: is cs 01f8 ss 01fe rs 01fe tfd c0 serr cmd 0004d317 Jan 14 01:33:57 server kernel: ahcich0: AHCI reset: device not ready after 31000ms (tfd = 0080) Jan 14 01:33:57 server kernel: ahcich1: Timeout on slot 31 port 0 Jan 14 01:33:57 server kernel: ahcich1: is cs 8000 ss rs 8000 tfd c0 serr cmd 0004df17 Jan 14 01:33:57 server kernel: 
ahcich0: Timeout on slot 24 port 0 --cut-- smartctl -a /dev/ada0 smartctl 5.42 2011-10-20 r3458 [FreeBSD 9.0-RELEASE amd64] (local build) Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Model Family: SAMSUNG SpinPoint F1 DT Device Model: SAMSUNG HD753LJ Serial Number:S13UJDWS900110 LU WWN Device Id: 5 0024e9 0020d1bfa Firmware Version: 1AA01118 User Capacity:750,156,374,016 bytes [750 GB] Sector Size: 512 bytes logical/physical Device is:In smartctl database [for details use: -P show] ATA Version is: 8 ATA Standard is: ATA-8-ACS revision 3b Local Time is:Tue Feb 14 16:32:58 2012 CET SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection:( 9429) seconds. Offline data collection capabilities:(0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities:(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability:(0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time:( 2) minutes. Extended self-test routine recommended polling time:( 158) minutes. Conveyance self-test routine recommended polling time:( 17) minutes. SCT capabilities: (0x003f) SCT Status supported. SCT Error Recovery Control supported.
Re: Custom kernel poll summary (was: Re: Reducing the need to compile a custom kernel)
On Tue, 14 Feb 2012 2:37:55 +0100, Alexander Leidinger wrote:
> Here is what I got, the first column is the number of requests, the second
> what is requested, and the 3rd my comments (basically it means, if there is a
> comment, it is not needed/possible to include in a modular kernel):
> ---snip---
[..]
> 1 IPFIREWALL_FORWARD -> performance impact too big if unused
> (julian)

I expect Julian will object if I've mis-paraphrased or over-simplified something I recall him saying at least a couple of years ago :)
[..]
> 4 ALTQ* -> does add code to the pf module
> other impact?

ipfw(8) can also apply ALTQ tags, but relies on pfctl(8) to setup the queues - or so I read; I've not used it here. From altq(4):

ALTQ        Enable ALTQ.
ALTQ_CBQ    Build the ``Class Based Queuing'' discipline.
ALTQ_RED    Build the ``Random Early Detection'' extension.
ALTQ_RIO    Build ``Random Early Drop'' for input and output.
ALTQ_HFSC   Build the ``Hierarchical Packet Scheduler'' discipline.
ALTQ_CDNR   Build the traffic conditioner. This option is meaningless at the moment as the conditioner is not used by any of the available disciplines or consumers.
ALTQ_PRIQ   Build the ``Priority Queuing'' discipline.
ALTQ_NOPCC  Required if the TSC is unusable.
ALTQ_DEBUG  Enable additional debugging facilities.

Note that ALTQ-disciplines cannot be loaded as kernel modules. In order to use a certain discipline you have to build it into a custom kernel. The pf(4) interface, that is required for the configuration process of ALTQ can be loaded as a module.

So which disciplines would one choose? Seeming an unlikely candidate?

> 1 IPSTEALTH -> changes ipfw module only?

I don't think this is specific to ipfw. From /sys/conf/NOTES:

# IPSTEALTH enables code to support stealth forwarding (i.e., forwarding
# packets without touching the TTL). This can be useful to hide firewalls
# from traceroute and similar tools.

But can it be disabled once added to kernel? It's no good as a default.

> 1 IPFIREWALL_VERBOSE_LIMIT=5 -> changes ipfw module only?
> loader tunable?
> 1 IPFIREWALL_VERBOSE -> changes ipfw module only?
> loader tunable?

sysctl.conf: net.inet.ip.fw.verbose and net.inet.ip.fw.verbose_limit

cheers, Ian
Re: problems with AHCI on FreeBSD 8.2
On Tue, Feb 14, 2012 at 06:16:01AM -0800, Jeremy Chadwick wrote: [..] > > Thanks. Both your drives look overall fine, sort-of. I'll outline my > concern points, and ask for some more info: > > * ada0 has 28 CRC errors, while ada1 has 2. These drives have been in > use for 4688 hours and 4583 hours (respectively), which is roughly 6 > months for each drive. CRC errors usually result in transparent > retransmits, but this can sometimes cause I/O delays (especially if the > CRC errors are repeated). > > If the timeout messages recur in the future, please run the commands I > gave you above once more and provide the output. I can then compare the > old to the new and see if there is anything of interest. I can force the error each time i want. Its 100% reproducible on my environment so i'll do the tests and send you smartctl -a output again. > > * Both drives had 2 long tests run on them a few days ago ("Extended > offline" tests). Did you induce these manually? If so, were these > tests running at the time you witnessed AHCI timeout errors on ada0? > Short, long, and selective surface scan tests are supposed to be > non-intrusive, but given the nature of the tests sometimes they can > stall the I/O subsystem. I've ran the tests, but they were not running during timeout problems. The only thing running on the disks was a newfs -J under a gjournal partiton. For the rest, they're mostly idle. > > If you do tests of this nature, you should write down the exact > dates/times when you ran them (at least from now on). > > If you didn't induce these, something must have, or possibly the drive > itself did it (and if that's the case, convenient that it induces an > entry in the self-test log!). 
> > I do have some familiarity with drives doing internal tests -- the best > example are old IBM Deskstar drives executing ADM on their own, > resulting in the drives spinning down and performing internal tests, > which would subsequently be interrupted by ATA I/O, drive spins back up, > etc. -- but took too long resulting in ATA timeouts on FreeBSD and > Linux. I mailed IBM about this back in 2000 and got confirmation of the > feature (which was also on their SCSI drives but defaulted to off); the > feature was mysteriously removed in future drive models and still > remains gone today: > > http://jdc.parodius.com/freebsd/ibm_email_aware_of_adm.txt > > I'm not saying your drives do this. I'm simply saying that if there is > some form of automated test that runs on these drives which is > transparent to the underlying ATA layer, then there is really nothing > you can do about it, and timeouts are possible. The IBM ADM issue was > only discovered after reviewing technical specifications/documentation > and compared to their SCSI drives. That's of course possible, but as the problem is 100% reproducible with AHCI driver and is not with ata driver, i guess this time is not drive's fault. We've also tested replacement disks and cables during the previous days. I guess the problem is in some bad interaction with AHCI driver. > > * Samsung has a notoriously bad reputation for firmware reliability on > their SpinPoint drives, but I haven't read of anything bad about the F2 > series, just the F1, F3, and F4 models. I have very little (almost > none) experience with these drives. I'm not boycotting their products, > but I wouldn't be surprised if the timeout errors you saw were caused by > something internal the drive was doing. There is absolutely zero > visibility into this kind of problem on any layer (even if you had an > ATA protocol analyser hooked up); you're completely at the mercy of the > firmware. 
Just something to keep in mind when working with ANY kind of > disk (MHDD, SSD, etc.). I've seen reports on freebsd lists and smartmontools wiki about firmware problems with F4 drives manufactured before december of 2010, but checking samsung's web page, seems this drives are not affected. I hope we're not hitting a new bug. More info: http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks > > All that said, could you please provide output from the following > commands as well? These may return "not supported" errors, which is > acceptable, but we have to check. > > * smartctl -l devstat /dev/ada0 > * smartctl -l sataphy /dev/ada0 > * smartctl -l devstat /dev/ada1 > * smartctl -l sataphy /dev/ada1 > Thanks a lot for you help Jeremy. Attached is the output of the commands: fe09# smartctl -l devstat /dev/ada0 smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-STABLE amd64] (local build) Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net (pass0:ahcich0:0:0:0): READ_LOG_EXT. ACB: 2f 00 04 00 00 40 00 00 00 00 01 00 (pass0:ahcich0:0:0:0): CAM status: ATA Status Error (pass0:ahcich0:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT ) (pass0:ahcich0:0:0:0): RES: 51 04 04 00 00 40 00 00 00 01 00 ATA_READ_LOG_EXT (addr=0x04:0x00, page=0, n=1) failed: Unknown
Re: siisch1: Error while READ LOG EXT
On Tue, Feb 14, 2012 at 09:30:29AM -0500, Mike Tancsa wrote:
> On 2/10/2012 8:43 PM, Mike Tancsa wrote:
> > On 2/10/2012 8:27 PM, Jeremy Chadwick wrote:
> >> Mike,
> >>
> >> I wanted to make you aware of this commit that just came through:
> >>
> >> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/ata/ata_da.c
> >
> > Thanks, I did see that. I was going to wait until Monday to csup up
> > once all the weekend level zeros are done. The prior kernels from Nov
> > 28th never saw these READ LOG EXT errors on either of these 2 big zfs boxes
>
> So far so good. Unfortunately, I had to make 2 changes to the box
> showing the problem the most. I changed the cable (the new one does seem
> to fit more snug) as well as updated the code. I havent done many level
> 0 dumps to it (the real test will be the weekend), but so far so good.
> On the other box that did show the same READ LOG EXT error, I also
> updated the kernel, but made no hardware changes. It too has not yet
> shown any errors since the upgrade.
>
> I changed the cable at 8am local time yesterday, and I take snapshots of
> smartctl at 5am

Cool.

> I did see this error increase in 24hrs, but that was on a disk that was
> off the motherboard. Perhaps a new cable for it too.
>
> < 0x000a  2           12  Device-to-host register FISes sent due to a COMRESET
> ---
> > 0x000a  2            6  Device-to-host register FISes sent due to a COMRESET

This ID tracks the number of times an actual communication reset command was sent from the drive to the controller via a FIS packet. This is at the SATA layer, not the ATA command layer.

It's completely normal/okay for a drive to have this number increase, especially if the machine is shut off, force-reset (via reset button), or in some cases simply soft rebooted.

Nothing to worry about here; no need to adjust cables or otherwise. Values 6, 12, etc. are all perfectly reasonable and will vary from system to system based on use.

--
| Jeremy Chadwick                                 j...@parodius.com |
| Parodius Networking                     http://www.parodius.com/ |
| UNIX Systems Administrator                 Mountain View, CA, US |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
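Editorial sketch: Mike's routine of saving "smartctl -l sataphy" output once a day and diffing it can be reduced to a single number per counter. The snippet below is a minimal POSIX sh illustration, assuming snapshots saved as plain files in the four-column layout shown in this thread. The function name counter_delta and the file paths in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; not part of smartmontools.

# Print how much one SATA Phy Event Counter changed between two saved
# "smartctl -l sataphy" snapshots.
# Arguments: older snapshot file, newer snapshot file, counter ID.
counter_delta() {
    # "$3 + 0" turns a saturated value such as "65535+" into a plain number.
    old=$(awk -v id="$3" '($1 "") == (id "") { print $3 + 0; exit }' "$1")
    new=$(awk -v id="$3" '($1 "") == (id "") { print $3 + 0; exit }' "$2")
    # A counter missing from a snapshot is treated as 0.
    echo $(( ${new:-0} - ${old:-0} ))
}
```

Usage would look like: counter_delta /var/log/sataphy.yesterday /var/log/sataphy.today 0x000a. Per the explanation above, a small positive delta on 0x000a across a reboot or power cycle is expected, so this is mainly useful for spotting a counter that keeps climbing while the machine stays up.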
Re: siisch1: Error while READ LOG EXT
On 2/10/2012 8:43 PM, Mike Tancsa wrote:
> On 2/10/2012 8:27 PM, Jeremy Chadwick wrote:
>> Mike,
>>
>> I wanted to make you aware of this commit that just came through:
>>
>> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cam/ata/ata_da.c
>
> Thanks, I did see that. I was going to wait until Monday to csup up
> once all the weekend level zeros are done. The prior kernels from Nov
> 28th never saw these READ LOG EXT errors on either of these 2 big zfs boxes

So far so good. Unfortunately, I had to make 2 changes to the box showing the problem the most. I changed the cable (the new one does seem to fit more snug) as well as updated the code. I havent done many level 0 dumps to it (the real test will be the weekend), but so far so good. On the other box that did show the same READ LOG EXT error, I also updated the kernel, but made no hardware changes. It too has not yet shown any errors since the upgrade.

I changed the cable at 8am local time yesterday, and I take snapshots of smartctl at 5am

I did see this error increase in 24hrs, but that was on a disk that was off the motherboard. Perhaps a new cable for it too.

< 0x000a  2           12  Device-to-host register FISes sent due to a COMRESET
---
> 0x000a  2            6  Device-to-host register FISes sent due to a COMRESET

---Mike

--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
Re: problems with AHCI on FreeBSD 8.2
On Tue, Feb 14, 2012 at 02:54:35PM +0100, Victor Balada Diaz wrote: > On Tue, Feb 14, 2012 at 02:05:13AM -0800, Jeremy Chadwick wrote: > > On Tue, Feb 14, 2012 at 10:19:09AM +0100, Victor Balada Diaz wrote: > > > We're having some troubles with AHCI under FreeBSD 8.2 and 8-STABLE. The > > > error is: > > > > > > ahcich0: Timeout on slot 8 > > > ahcich0: is cs 0100 ss rs 0100 tfd c0 serr > > > > > > ahcich0: AHCI reset... > > > ahcich0: SATA connect time=0ms status=0123 > > > ahcich0: ready wait time=18ms > > > ahcich0: AHCI reset done: device found > > > (ada0:ahcich0:0:0:0): Request requeued > > > (ada0:ahcich0:0:0:0): Retrying command > > > (ada0:ahcich0:0:0:0): Command timed out > > > (ada0:ahcich0:0:0:0): Retrying command > > > ahcich0: Timeout on slot 8 > > > ahcich0: is cs 007ff000 ss 007fff00 rs 007fff00 tfd c0 serr > > > > > > ahcich0: AHCI reset... > > > ahcich0: SATA connect time=0ms status=0123 > > > ahcich0: ready wait time=84ms > > > ahcich0: AHCI reset done: device found > > > (ada0:ahcich0:0:0:0): Request requeued > > > (ada0:ahcich0:0:0:0): Retrying command > > > (ada0:ahcich0:0:0:0): Command timed out > > > (ada0:ahcich0:0:0:0): Retrying command > > > (ada0:ahcich0:0:0:0): Request requeued > > > [...] > > > > > > If we use old ATA driver we have no problems. If we just use the first > > > disk (ada0) with ahci, > > > no problems either. If we use both disks (ada0 and ada1) in gmirror setup > > > with ahci, we > > > got the above error. If we use both disks in gmirror with old ata driver, > > > no problems. > > > > Please provide SMART statistics for both disks by installing > > ports/sysutils/smartmontools (5.42 or newer please) and running > > "smartctl -a" against both disks (ada0/ada1, or ad4/ad10 -- doesn't > > matter which driver you're using). I will review the output. > > Just forgot to say that from time to time, after system hangs and i need > to reboot, one of the disks is lost. 
It doesn't even show after a few reboots, > nor on Linux live system. > > You can see smartctl output here: > > ada0: > > # smartctl -a /dev/ada0 > smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-STABLE amd64] (local build) > Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net > > === START OF INFORMATION SECTION === > Model Family: SAMSUNG SpinPoint F2 EG > Device Model: SAMSUNG HD154UI > Serial Number:S24EJ9BB200080 > LU WWN Device Id: 5 0024e9 2047cb78f > Firmware Version: 1AG01118 > User Capacity:1,500,301,910,016 bytes [1.50 TB] > Sector Size: 512 bytes logical/physical > Device is:In smartctl database [for details use: -P show] > ATA Version is: 8 > ATA Standard is: ATA-8-ACS revision 3b > Local Time is:Tue Feb 14 13:51:18 2012 CET > SMART support is: Available - device has SMART capability. > SMART support is: Enabled > > === START OF READ SMART DATA SECTION === > SMART overall-health self-assessment test result: PASSED > > General SMART Values: > Offline data collection status: (0x00) Offline data collection activity > was never started. > Auto Offline Data Collection: > Disabled. > Self-test execution status: ( 0) The previous self-test routine > completed > without error or no self-test has > ever > been run. > Total time to complete Offline > data collection:(18863) seconds. > Offline data collection > capabilities:(0x7b) SMART execute Offline immediate. > Auto Offline data collection on/off > support. > Suspend Offline collection upon new > command. > Offline surface scan supported. > Self-test supported. > Conveyance Self-test supported. > Selective Self-test supported. > SMART capabilities:(0x0003) Saves SMART data before entering > power-saving mode. > Supports SMART auto save timer. > Error logging capability:(0x01) Error logging supported. > General Purpose Logging supported. > Short self-test routine > recommended polling time:( 2) minutes. > Extended self-test routine > recommended polling time:( 255) minutes. 
> Conveyance self-test routine > recommended polling time:( 33) minutes. > SCT capabilities: (0x003f) SCT Status supported. > SCT Error Recovery Control supported. > SCT Feature Control supported. >
Re: sysutils/pftop on 9.x+
On Mon, 13 Feb 2012 14:09:25 -0600 (CST), Greg Rivers wrote:
> sysutils/pftop was marked broken on 9.x and above last March[1]. Are
> there any plans to fix it soon? It's a really handy utility.
>
> [1]
> http://www.freebsd.org/cgi/cvsweb.cgi/ports/sysutils/pftop/Makefile?rev=1.17

It looks like there are some patches in pkgsrc to make it work with DragonFlyBSD/NetBSD. I don't have the time to try them...

http://pkgsrc.se/sysutils/pftop

HTH

Regards.
Re: problems with AHCI on FreeBSD 8.2
On Tue, Feb 14, 2012 at 02:05:13AM -0800, Jeremy Chadwick wrote: > On Tue, Feb 14, 2012 at 10:19:09AM +0100, Victor Balada Diaz wrote: > > We're having some troubles with AHCI under FreeBSD 8.2 and 8-STABLE. The > > error is: > > > > ahcich0: Timeout on slot 8 > > ahcich0: is cs 0100 ss rs 0100 tfd c0 serr > > > > ahcich0: AHCI reset... > > ahcich0: SATA connect time=0ms status=0123 > > ahcich0: ready wait time=18ms > > ahcich0: AHCI reset done: device found > > (ada0:ahcich0:0:0:0): Request requeued > > (ada0:ahcich0:0:0:0): Retrying command > > (ada0:ahcich0:0:0:0): Command timed out > > (ada0:ahcich0:0:0:0): Retrying command > > ahcich0: Timeout on slot 8 > > ahcich0: is cs 007ff000 ss 007fff00 rs 007fff00 tfd c0 serr > > > > ahcich0: AHCI reset... > > ahcich0: SATA connect time=0ms status=0123 > > ahcich0: ready wait time=84ms > > ahcich0: AHCI reset done: device found > > (ada0:ahcich0:0:0:0): Request requeued > > (ada0:ahcich0:0:0:0): Retrying command > > (ada0:ahcich0:0:0:0): Command timed out > > (ada0:ahcich0:0:0:0): Retrying command > > (ada0:ahcich0:0:0:0): Request requeued > > [...] > > > > If we use old ATA driver we have no problems. If we just use the first disk > > (ada0) with ahci, > > no problems either. If we use both disks (ada0 and ada1) in gmirror setup > > with ahci, we > > got the above error. If we use both disks in gmirror with old ata driver, > > no problems. > > Please provide SMART statistics for both disks by installing > ports/sysutils/smartmontools (5.42 or newer please) and running > "smartctl -a" against both disks (ada0/ada1, or ad4/ad10 -- doesn't > matter which driver you're using). I will review the output. Just forgot to say that from time to time, after system hangs and i need to reboot, one of the disks is lost. It doesn't even show after a few reboots, nor on Linux live system. 
You can see smartctl output here: ada0: # smartctl -a /dev/ada0 smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-STABLE amd64] (local build) Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Model Family: SAMSUNG SpinPoint F2 EG Device Model: SAMSUNG HD154UI Serial Number:S24EJ9BB200080 LU WWN Device Id: 5 0024e9 2047cb78f Firmware Version: 1AG01118 User Capacity:1,500,301,910,016 bytes [1.50 TB] Sector Size: 512 bytes logical/physical Device is:In smartctl database [for details use: -P show] ATA Version is: 8 ATA Standard is: ATA-8-ACS revision 3b Local Time is:Tue Feb 14 13:51:18 2012 CET SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection:(18863) seconds. Offline data collection capabilities:(0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities:(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability:(0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time:( 2) minutes. Extended self-test routine recommended polling time:( 255) minutes. Conveyance self-test routine recommended polling time:( 33) minutes. SCT capabilities: (0x003f) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. 
SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f
Re: Custom kernel poll summary (was: Re: Reducing the need to compile a custom kernel)
Quoting Attilio Rao (from Tue, 14 Feb 2012 12:38:17 +):

> 2012/2/14, Alexander Leidinger :
>> 2 SW_WATCHDOG
>
> This can become a module with very little effort I guess.

What's the TODO list for this?

Bye,
Alexander.

-- 
No man is lonely while eating spaghetti.

http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org      netchild @ FreeBSD.org   : PGP ID = 72077137
Re: Reducing the need to compile a custom kernel
On Tue, February 14, 2012 08:31, Alexander Leidinger wrote:
> Quoting Paul Schenkeveld (from Fri, 10 Feb 2012 15:44:50 +0100):
>
>> On Fri, Feb 10, 2012 at 02:56:04PM +0100, Alexander Leidinger wrote:
>>> Hi,
>>>
>>> during some big discussions in the last months on various lists, one of
>>> the problems was that some people would like to use freebsd-update but
>>> can't as they are using a custom kernel. With all the kernel modules
>>> we provide, the need for a custom kernel should be small, but on the
>>> other hand, we do not provide a small kernel-skeleton where you can
>>> load just the modules you need.
>>>
>>> This should be easy to change. As a first step I took the generic
>>> kernel and removed all devices which are available as modules, e.g.
>>> the USB section consists now only of the USB_DEBUG option (so that the
>>> module is built like with the current generic kernel). I also removed
>>> some storage drivers which are not available as a module. The
>>> rationale is, that I can not remove CAM from the kernel config if I
>>> let those drivers inside (if those drivers are important enough,
>>> someone will probably fix the problem and add the missing pieces to
>>> generate a module).
>>>
>>> Such a kernel would cover situations where people compile their own
>>> kernel because they want to get rid of some unused kernel code (and
>>> maybe even need the memory this frees up).
>>>
>>> The question is, is this enough? Or asked differently, why are you
>>> compiling a custom kernel in a production environment (so I rule out
>>> debug options which are not enabled in GENERIC)? Are there options
>>> which you add which you can not add as a module (SW_WATCHDOG comes to
>>> my mind)? If yes, which ones and how important are they for you?
>>
>> - INET without INET6
>> - SOFTUPDATES, UFS_ACL, AUDIT, SCTP (left out for embedded devices)
>> - Björn may add INET6 without INET
>> - SCHED_ULE vs. SCHED_4BSD
>> - No vga console/atkbd/psm for embedded devices
>> - CPU_SOEKRIS, CPU_GEODE, CPU_ELAN, NO_SWAPPING for embedded devices
>
> Embedded devices are out of the scope of this, normally you do a lot
> of other modifications to such systems anyway, so a custom kernel
> should be not a big problem.
>
> I will also not touch the dual-stack part of the kernel config (it
> shall still allow the generic purpose computing like the GENERIC
> config).

I'm really curious why, given that embedded devices are usually the worst hardware to compile things on, whether because of access issues or because the hardware is slow (compiling kernel+world is great on an i7, but a pain on my net5501-70). It's a bummer to hear this :(

matheus

>> - IPSTEALTH, IPSEC, IPSEC_FILTERTUNNEL, IPFILTER, ALTQ for firewalls
>
> Request noted.
>
>> I also always specify exactly one CPU type (on i386), know it made a
>> difference in the 386/486/586 era but am not sure how much difference
>> it makes nowadays.
>
> The 386 part (which we do not have anymore in GENERIC) made a
> difference, the rest doesn't hurt in the kernel.
>
> Bye,
> Alexander.
>
> --
> Smuggling... It's not just a job, it's an adventure!
> -- paid for by your local Colombian recruiting office
>
> http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
> http://www.FreeBSD.org      netchild @ FreeBSD.org   : PGP ID = 72077137

-- 
We will call you Cygnus,
The God of balance you shall be

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

http://en.wikipedia.org/wiki/Posting_style
Re: Custom kernel poll summary (was: Re: Reducing the need to compile a custom kernel)
2012/2/14, Alexander Leidinger :
> Quoting Alexander Leidinger (from Fri, 10 Feb 2012 14:56:04 +0100):
>
>> Such a kernel would cover situations where people compile their own
>> kernel because they want to get rid of some unused kernel code (and
>> maybe even need the memory this frees up).
>>
>> The question is, is this enough? Or asked differently, why are you
>> compiling a custom kernel in a production environment (so I rule out
>> debug options which are not enabled in GENERIC)? Are there options
>> which you add which you can not add as a module (SW_WATCHDOG comes
>> to my mind)? If yes, which ones and how important are they for you?
>
> Here is what I got, the first column is the number of requests, the
> second what is requested, and the 3rd my comments (basically it means,
> if there is a comment, it is not needed/possible to include in a
> modular kernel):

...

> 2 SW_WATCHDOG

This can become a module with very little effort I guess.

Thanks,
Attilio

-- 
Peace can only be achieved by understanding - A. Einstein
[releng_9 tinderbox] failure on sparc64/sparc64
TB --- 2012-02-14 10:58:05 - tinderbox 2.9 running on freebsd-stable.sentex.ca
TB --- 2012-02-14 10:58:05 - starting RELENG_9 tinderbox run for sparc64/sparc64
TB --- 2012-02-14 10:58:05 - cleaning the object tree
TB --- 2012-02-14 10:58:05 - cvsupping the source tree
TB --- 2012-02-14 10:58:05 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_9/sparc64/sparc64/supfile
TB --- 2012-02-14 10:58:44 - building world
TB --- 2012-02-14 10:58:44 - CROSS_BUILD_TESTING=YES
TB --- 2012-02-14 10:58:44 - MAKEOBJDIRPREFIX=/obj
TB --- 2012-02-14 10:58:44 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2012-02-14 10:58:44 - SRCCONF=/dev/null
TB --- 2012-02-14 10:58:44 - TARGET=sparc64
TB --- 2012-02-14 10:58:44 - TARGET_ARCH=sparc64
TB --- 2012-02-14 10:58:44 - TZ=UTC
TB --- 2012-02-14 10:58:44 - __MAKE_CONF=/dev/null
TB --- 2012-02-14 10:58:44 - cd /src
TB --- 2012-02-14 10:58:44 - /usr/bin/make -B buildworld
>>> World build started on Tue Feb 14 10:58:46 UTC 2012
>>> Rebuilding the temporary build tree
>>> stage 1.1: legacy release compatibility shims
>>> stage 1.2: bootstrap tools
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3: cross tools
>>> stage 4.1: building includes
>>> stage 4.2: building libraries
>>> stage 4.3: make dependencies
>>> stage 4.4: building everything
>>> World build completed on Tue Feb 14 12:05:49 UTC 2012
TB --- 2012-02-14 12:05:49 - generating LINT kernel config
TB --- 2012-02-14 12:05:49 - cd /src/sys/sparc64/conf
TB --- 2012-02-14 12:05:49 - /usr/bin/make -B LINT
TB --- 2012-02-14 12:05:49 - cd /src/sys/sparc64/conf
TB --- 2012-02-14 12:05:49 - /usr/sbin/config -m LINT
TB --- 2012-02-14 12:05:49 - building LINT kernel
TB --- 2012-02-14 12:05:49 - CROSS_BUILD_TESTING=YES
TB --- 2012-02-14 12:05:49 - MAKEOBJDIRPREFIX=/obj
TB --- 2012-02-14 12:05:49 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2012-02-14 12:05:49 - SRCCONF=/dev/null
TB --- 2012-02-14 12:05:49 - TARGET=sparc64
TB --- 2012-02-14 12:05:49 - TARGET_ARCH=sparc64
TB --- 2012-02-14 12:05:49 - TZ=UTC
TB --- 2012-02-14 12:05:49 - __MAKE_CONF=/dev/null
TB --- 2012-02-14 12:05:49 - cd /src
TB --- 2012-02-14 12:05:49 - /usr/bin/make -B buildkernel KERNCONF=LINT
>>> Kernel build for LINT started on Tue Feb 14 12:05:49 UTC 2012
>>> stage 1: configuring the kernel
>>> stage 2.1: cleaning up the object tree
>>> stage 2.2: rebuilding the object tree
>>> stage 2.3: build tools
>>> stage 3.1: making dependencies
[...]
/usr/bin/make -V CFILES -V SYSTEM_CFILES -V GEN_CFILES | MKDEP_CPP="cc -E" CC="cc" xargs mkdep -a -f .newdep -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ipfilter -I/src/sys/contrib/pf -I/src/sys/dev/ath -I/src/sys/dev/ath/ath_hal -I/src/sys/contrib/ngatm -I/src/sys/dev/twa -I/src/sys/gnu/fs/xfs/FreeBSD -I/src/sys/gnu/fs/xfs/FreeBSD/support -I/src/sys/gnu/fs/xfs -I/src/sys/dev/cxgb -I/src/sys/dev/cxgbe -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -mcmodel=medany -msoft-float -ffreestanding -fstack-protector
cc: /src/sys/dev/oce/oce_hw.c: No such file or directory
cc: /src/sys/dev/oce/oce_if.c: No such file or directory
cc: /src/sys/dev/oce/oce_mbox.c: No such file or directory
cc: /src/sys/dev/oce/oce_queue.c: No such file or directory
cc: /src/sys/dev/oce/oce_sysctl.c: No such file or directory
cc: /src/sys/dev/oce/oce_util.c: No such file or directory
mkdep: compile failed
*** Error code 1

Stop in /obj/sparc64.sparc64/src/sys/LINT.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2012-02-14 12:07:30 - WARNING: /usr/bin/make returned exit code 1
TB --- 2012-02-14 12:07:30 - ERROR: failed to build LINT kernel
TB --- 2012-02-14 12:07:30 - 2961.26 user 499.21 system 4165.45 real

http://tinderbox.freebsd.org/tinderbox-releng_9-RELENG_9-sparc64-sparc64.full
Custom kernel poll summary (was: Re: Reducing the need to compile a custom kernel)
Quoting Alexander Leidinger (from Fri, 10 Feb 2012 14:56:04 +0100):

> Such a kernel would cover situations where people compile their own
> kernel because they want to get rid of some unused kernel code (and
> maybe even need the memory this frees up).
>
> The question is, is this enough? Or asked differently, why are you
> compiling a custom kernel in a production environment (so I rule out
> debug options which are not enabled in GENERIC)? Are there options
> which you add which you can not add as a module (SW_WATCHDOG comes to
> my mind)? If yes, which ones and how important are they for you?

Here is what I got. The first column is the number of requests, the second is what was requested, and the third holds my comments (basically, if there is a comment, the option is not needed or not possible to include in a modular kernel):

---snip---
5  IPSEC
4  ALTQ
2  VIMAGE                       -> not production ready (bz)
2  SW_WATCHDOG
2  IPSEC_FILTERTUNNEL           -> obsolete according to bz
2  IPFIREWALL_DEFAULT_TO_ACCEPT -> loader.conf: net.inet.ip.fw.default_to_accept
2  IPFIREWALL                   -> loader.conf: ipfw_load='YES'
2  HZ=1000                      -> loader.conf: kern.hz
2  DEVICE_POLLING               -> ifconfig in 9.0 handles this at runtime?
1  enc
1  ZERO_COPY_SOCKETS            -> has known problems? can't find the reference, but I removed it from my kernels
1  SC_* options                 -> not a generic setting, will not include
1  ROUTETABLES=n                -> bz is working on this
1  QUOTA
1  PF                           -> loader.conf: pf_load='YES'
1  MROUTING                     -> loader.conf: ip_mroute='YES'?
1  KTR                          -> rare use case, kernel recompile is OK
1  KDTRACE_HOOKS                -> legal review needed
1  KDB_UNATTENDED               -> re@ wants this, but has reservations
1  KDB_TRACE                    -> re@ wants this, but has reservations
1  KDB                          -> re@ wants this, but has reservations
1  IPSTEALTH
1  IPSEC_NAT_T
1  IPFIREWALL_VERBOSE_LIMIT=5
1  IPFIREWALL_VERBOSE
1  IPFIREWALL_FORWARD           -> performance impact too big if unused (julian)
1  IPFILTER                     -> 2/3 firewalls can be loaded... and this one is not really maintained anymore
1  IPDIVERT                     -> loader.conf: ipdivert_load='YES'
1  GDB
1  FLOWTABLE
1  DUMMYNET                     -> loader.conf: dummynet_load='YES'
1  DIRECTIO
1  DDB_NUMSYM
1  DDB
1  BREAK_TO_DEBUGGER            -> loader.conf: debug.kdb.break_to_debugger
1  BPF_JITTER
1  ALT_BREAK_TO_DEBUGGER        -> loader.conf: debug.kdb.alt_break_to_debugger
---snip---

Yes, this poll is not representative...

So... what's the impact of including the following options in a kernel which is intended to be modular? Or, put differently, are there reasons to _not_ include one of the following?

---snip---
5  IPSEC                        -> we do not have a separate crypto dist, so it should be possible to include in a kernel now... legal advice needed
4  ALTQ*                        -> adds code to the pf module; other impact?
2  SW_WATCHDOG                  -> should not hurt if not enabled in rc.conf
1  enc                          -> together with IPSEC
1  IPSTEALTH                    -> changes ipfw module only?
1  IPSEC_NAT_T
1  IPFIREWALL_VERBOSE_LIMIT=5   -> changes ipfw module only? loader tunable?
1  IPFIREWALL_VERBOSE           -> changes ipfw module only? loader tunable?
1  FLOWTABLE
1  DIRECTIO
1  BPF_JITTER
---snip---

Bye,
Alexander.

-- 
Q: What is purple and concord the world?
A: Alexander the Grape.

http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org      netchild @ FreeBSD.org   : PGP ID = 72077137
Re: dhclient script adjustments
Jason Hellenthal writes:
> After recent merges to stable/8 I am now seeing errors on bootup of
> the following for three interfaces that will never see the light of
> DHCP.
>
> /etc/rc.d/dhclient: ERROR: 'dc1' is not a DHCP-enabled interface

This is perfectly harmless. Just ignore these messages. They will go away as soon as r230388 is MFCed.

DES
-- 
Dag-Erling Smørgrav - d...@des.no
Re: Reducing the need to compile a custom kernel
Quoting Paul Schenkeveld (from Fri, 10 Feb 2012 15:44:50 +0100):

> On Fri, Feb 10, 2012 at 02:56:04PM +0100, Alexander Leidinger wrote:
>> Hi,
>>
>> during some big discussions in the last months on various lists, one of
>> the problems was that some people would like to use freebsd-update but
>> can't as they are using a custom kernel. With all the kernel modules
>> we provide, the need for a custom kernel should be small, but on the
>> other hand, we do not provide a small kernel-skeleton where you can
>> load just the modules you need.
>>
>> This should be easy to change. As a first step I took the generic
>> kernel and removed all devices which are available as modules, e.g.
>> the USB section consists now only of the USB_DEBUG option (so that the
>> module is built like with the current generic kernel). I also removed
>> some storage drivers which are not available as a module. The
>> rationale is, that I can not remove CAM from the kernel config if I
>> let those drivers inside (if those drivers are important enough,
>> someone will probably fix the problem and add the missing pieces to
>> generate a module).
>>
>> Such a kernel would cover situations where people compile their own
>> kernel because they want to get rid of some unused kernel code (and
>> maybe even need the memory this frees up).
>>
>> The question is, is this enough? Or asked differently, why are you
>> compiling a custom kernel in a production environment (so I rule out
>> debug options which are not enabled in GENERIC)? Are there options
>> which you add which you can not add as a module (SW_WATCHDOG comes to
>> my mind)? If yes, which ones and how important are they for you?
>
> - INET without INET6
> - SOFTUPDATES, UFS_ACL, AUDIT, SCTP (left out for embedded devices)
> - Björn may add INET6 without INET
> - SCHED_ULE vs. SCHED_4BSD
> - No vga console/atkbd/psm for embedded devices
> - CPU_SOEKRIS, CPU_GEODE, CPU_ELAN, NO_SWAPPING for embedded devices

Embedded devices are out of scope for this. Normally you make a lot of other modifications to such systems anyway, so a custom kernel should not be a big problem.

I will also not touch the dual-stack part of the kernel config (it shall still allow general-purpose computing like the GENERIC config).

> - IPSTEALTH, IPSEC, IPSEC_FILTERTUNNEL, IPFILTER, ALTQ for firewalls

Request noted.

> I also always specify exactly one CPU type (on i386), know it made a
> difference in the 386/486/586 era but am not sure how much difference
> it makes nowadays.

The 386 part (which we do not have anymore in GENERIC) made a difference, the rest doesn't hurt in the kernel.

Bye,
Alexander.

-- 
Smuggling... It's not just a job, it's an adventure!
-- paid for by your local Colombian recruiting office

http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org      netchild @ FreeBSD.org   : PGP ID = 72077137
Re: problems with AHCI on FreeBSD 8.2
On Tue, Feb 14, 2012 at 10:19:09AM +0100, Victor Balada Diaz wrote:
> We're having some troubles with AHCI under FreeBSD 8.2 and 8-STABLE. The
> error is:
>
> ahcich0: Timeout on slot 8
> ahcich0: is cs 0100 ss rs 0100 tfd c0 serr
> ahcich0: AHCI reset...
> ahcich0: SATA connect time=0ms status=0123
> ahcich0: ready wait time=18ms
> ahcich0: AHCI reset done: device found
> (ada0:ahcich0:0:0:0): Request requeued
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): Command timed out
> (ada0:ahcich0:0:0:0): Retrying command
> ahcich0: Timeout on slot 8
> ahcich0: is cs 007ff000 ss 007fff00 rs 007fff00 tfd c0 serr
> ahcich0: AHCI reset...
> ahcich0: SATA connect time=0ms status=0123
> ahcich0: ready wait time=84ms
> ahcich0: AHCI reset done: device found
> (ada0:ahcich0:0:0:0): Request requeued
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): Command timed out
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): Request requeued
> [...]
>
> If we use old ATA driver we have no problems. If we just use the first disk
> (ada0) with ahci, no problems either. If we use both disks (ada0 and ada1)
> in gmirror setup with ahci, we got the above error. If we use both disks in
> gmirror with old ata driver, no problems.

Please provide SMART statistics for both disks by installing ports/sysutils/smartmontools (5.42 or newer please) and running "smartctl -a" against both disks (ada0/ada1, or ad4/ad10 -- doesn't matter which driver you're using). I will review the output.

-- 
| Jeremy Chadwick                              j...@parodius.com |
| Parodius Networking                   http://www.parodius.com/ |
| UNIX Systems Administrator               Mountain View, CA, US |
| Making life hard for others since 1977.           PGP 4BD6C0CB |
Re: problems with AHCI on FreeBSD 8.2
On 02/14/12 11:19, Victor Balada Diaz wrote:
> We're having some troubles with AHCI under FreeBSD 8.2 and 8-STABLE. The
> error is:
>
> ahcich0: Timeout on slot 8
> ahcich0: is cs 0100 ss rs 0100 tfd c0 serr
> ahcich0: AHCI reset...
> ahcich0: SATA connect time=0ms status=0123
> ahcich0: ready wait time=18ms
> ahcich0: AHCI reset done: device found
> (ada0:ahcich0:0:0:0): Request requeued
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): Command timed out
> (ada0:ahcich0:0:0:0): Retrying command
> ahcich0: Timeout on slot 8
> ahcich0: is cs 007ff000 ss 007fff00 rs 007fff00 tfd c0 serr
> ahcich0: AHCI reset...
> ahcich0: SATA connect time=0ms status=0123
> ahcich0: ready wait time=84ms
> ahcich0: AHCI reset done: device found
> (ada0:ahcich0:0:0:0): Request requeued
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): Command timed out
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): Request requeued
> [...]
>
> If we use old ATA driver we have no problems. If we just use the first disk
> (ada0) with ahci, no problems either. If we use both disks (ada0 and ada1)
> in gmirror setup with ahci, we got the above error. If we use both disks in
> gmirror with old ata driver, no problems.

In both cases the controller reports the command status as 0xc0, which means the device is busy with the command. For NCQ commands it means that the device is in the stage of processing the command itself, not head positioning or data transfer.

Enabling AHCI enables NCQ for the devices. That increases the load on both the devices and the controller, and it is difficult to say whose fault it is here. SAMSUNG HD154UI disks AFAIR have 4k sectors, which may carry big performance penalties when accessing small/misaligned data. I am not sure how big that penalty can be in the worst case, especially since disks cache writes by default, hiding the real load level.

The relation to gmirror is harder to explain. Depending on how you created it and the partitions, it could cause more misaligned I/Os during rebuild. Using gmirror also doubles the concurrent load on the controller, but at this point I have nothing to blame it for.

-- 
Alexander Motin
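To make the "tfd c0" reading above concrete, here is a minimal userland sketch that decodes the status byte seen in the log. It assumes the low byte of the printed tfd value is the standard ATA status register; the function name decode_ata_status and the bit table are hypothetical names made up for this illustration and are not taken from the ahci(4) driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Standard ATA status register bits. The table and its names are
 * illustrative only; they are not copied from the FreeBSD sources.
 */
static const struct {
	uint8_t		bit;
	const char	*name;
} ata_status_bits[] = {
	{ 0x80, "BSY" },	/* device busy */
	{ 0x40, "DRDY" },	/* device ready */
	{ 0x20, "DF" },		/* device fault */
	{ 0x08, "DRQ" },	/* data request */
	{ 0x01, "ERR" },	/* error */
};

/*
 * Write a space-separated list of the status bits set in the low byte
 * of tfd into buf (at most len bytes including the terminator) and
 * return buf.
 */
const char *
decode_ata_status(uint32_t tfd, char *buf, size_t len)
{
	uint8_t status = tfd & 0xff;
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(ata_status_bits) / sizeof(ata_status_bits[0]); i++) {
		if ((status & ata_status_bits[i].bit) == 0)
			continue;
		if (buf[0] != '\0')
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, ata_status_bits[i].name, len - strlen(buf) - 1);
	}
	return (buf);
}
```

Calling decode_ata_status(0xc0, buf, sizeof(buf)) with the value from the log yields the BSY and DRDY names, which matches the "device is busy with the command" reading given above.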
Re: Reducing the need to compile a custom kernel
On Sun, Feb 12, 2012 at 8:52 AM, Ian Smith wrote:
> On Fri, 10 Feb 2012 16:12:00 +, Bjoern A. Zeeb wrote:
>
> > IPFIREWALL_FORWARD
>
> Unless something's changed, julian@ has pointed out (paraphrasing) that
> this adds bits of code to various parts of the stack and was thought to
> impact performance too much when unused to conditionalise each instance.
>
> I'm unsure if this is the only case ipfw still needs building in kernel?

If something's changed, I'd really love to hear it. IPFIREWALL_FORWARD is the most common reason I need a custom kernel (usually to solve the issues around asymmetric/source-based policy routing on multihomed hosts). Really miss Linux' "ip rule... table" functionality.

Regards,
-- 
Nino
Re: dhclient script adjustments
On Tue, Feb 14, 2012 at 02:47:00AM -0500, Jason Hellenthal wrote:
>
> Anyone ?
>

Sorry for the mess, I'm working on this to figure out why it does that.

Thanks for reporting,
regards,
Bapt
Re: Regression in 8.2-STABLE bge code (from 7.4-STABLE)
On Sat, Jan 28, 2012 at 09:24:53PM -0500, Michael L. Squires wrote:

Sorry for the late reply. I had been busy due to relocation.

> There is a bug in the Tyan S4881/S4882 PCI-X bridges that was fixed with a
> patch in 7.x (thank you very much). This patch is not present in the
> 8.2-STABLE code and the symptoms (watchdog timeouts) have recurred.
>

Hmm, I thought the mailbox reordering bug was avoided by limiting the DMA address space to 32 bits, but it seems that was not the right workaround for the AMD 8131 PCI-X bridge.

> The watchdog timeouts do not appear to be present after I switched to an
> Intel gigabit PCI-X card.
>
> I did a brute-force patch of the 8.2-STABLE bge code using the patches for
> 7.4-STABLE; the resulting code compiled and, other than odd behavior at
> startup, seems to be working normally.
>
> This is using FreeBSD 8.2-STABLE amd64; I don't know what happens with
> i386.
>
> Given the age of the boards it may be easier if I just continue using the
> Intel gigabit card but am happy to test anything that comes my way.
>

Try the attached patch and let me know how it goes. I didn't enable 64-bit DMA addressing, though. I think the AMD-8131 PCI-X bridge needs both workarounds.

> Thanks,
>
> Mike Squires
> mikes at siralan.org

Index: sys/dev/bge/if_bgereg.h
===
--- sys/dev/bge/if_bgereg.h	(revision 231621)
+++ sys/dev/bge/if_bgereg.h	(working copy)
@@ -2828,6 +2828,7 @@
 #define	BGE_FLAG_RX_ALIGNBUG	0x0400
 #define	BGE_FLAG_SHORT_DMA_BUG	0x0800
 #define	BGE_FLAG_4K_RDMA_BUG	0x1000
+#define	BGE_FLAG_MBOX_REORDER	0x2000
 	uint32_t		bge_phy_flags;
 #define	BGE_PHY_NO_WIRESPEED	0x0001
 #define	BGE_PHY_ADC_BUG		0x0002
Index: sys/dev/bge/if_bge.c
===
--- sys/dev/bge/if_bge.c	(revision 231621)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -380,6 +380,8 @@
 static int bge_dma_ring_alloc(struct bge_softc *, bus_size_t, bus_size_t,
     bus_dma_tag_t *, uint8_t **, bus_dmamap_t *, bus_addr_t *, const char *);
 
+static int bge_mbox_reorder(struct bge_softc *);
+
 static int bge_get_eaddr_fw(struct bge_softc *sc, uint8_t ether_addr[]);
 static int bge_get_eaddr_mem(struct bge_softc *, uint8_t[]);
 static int bge_get_eaddr_nvram(struct bge_softc *, uint8_t[]);
@@ -635,6 +637,8 @@
 		off += BGE_LPMBX_IRQ0_HI - BGE_MBX_IRQ0_HI;
 
 	CSR_WRITE_4(sc, off, val);
+	if ((sc->bge_flags & BGE_FLAG_MBOX_REORDER) != 0)
+		CSR_READ_4(sc, off);
 }
 
 /*
@@ -2609,8 +2613,8 @@
 	 * XXX
 	 * watchdog timeout issue was observed on BCM5704 which
 	 * lives behind PCI-X bridge(e.g AMD 8131 PCI-X bridge).
-	 * Limiting DMA address space to 32bits seems to address
-	 * it.
+	 * Both limiting DMA address space to 32bits and flushing
+	 * mailbox write seem to address the issue.
 	 */
 	if (sc->bge_flags & BGE_FLAG_PCIX)
 		lowaddr = BUS_SPACE_MAXADDR_32BIT;
@@ -2775,6 +2779,42 @@
 }
 
 static int
+bge_mbox_reorder(struct bge_softc *sc)
+{
+	/* Lists of PCI bridges that are known to reorder mailbox writes. */
+	static const struct mbox_reorder {
+		const uint16_t vendor;
+		const uint16_t device;
+		const char *desc;
+	} const mbox_reorder_lists[] = {
+		{ 0x1022, 0x7450, "AMD-8131 PCI-X Bridge" },
+	};
+	devclass_t pcib;
+	device_t dev;
+	int i, count, unit;
+
+	count = sizeof(mbox_reorder_lists) / sizeof(mbox_reorder_lists[0]);
+	pcib = devclass_find("pcib");
+	for (unit = 0; unit < devclass_get_maxunit(pcib); unit++) {
+		dev = devclass_get_device(pcib, unit);
+		if (dev == NULL)
+			continue;
+		for (i = 0; i < count; i++) {
+			if (pci_get_vendor(dev) ==
+			    mbox_reorder_lists[i].vendor &&
+			    pci_get_device(dev) ==
+			    mbox_reorder_lists[i].device) {
+				device_printf(sc->bge_dev,
+				    "enabling MBOX workaround for %s\n",
+				    mbox_reorder_lists[i].desc);
+				return (1);
+			}
+		}
+	}
+	return (0);
+}
+
+static int
 bge_attach(device_t dev)
 {
 	struct ifnet *ifp;
@@ -3094,6 +3134,14 @@
 	if (BGE_IS_5714_FAMILY(sc) && (sc->bge_flags & BGE_FLAG_PCIX))
 		sc->bge_flags |= BGE_FLAG_40BIT_BUG;
 	/*
+	 * Some PCI-X bridges are known to trigger write reordering to
+	 * the mailbox registers. Typical phenomena is watchdog timeouts
+	 * caused by out-of-order TX completions. Enable workaround for
+	 * PCI-X devices that live behind these bridges.
+	 */
+	if (sc->bge_flags & BGE_FLAG_PCIX && bge_mbox_reorder(sc) != 0)
+		sc->bge_flags |= BGE_FLAG_MBOX_REORDER;
+	/*
 	 * Allocate the interrupt, using MSI if possible. These devices
 	 * support 8 MSI messages, but only the first one is used in
 	 * normal operation.
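For anyone who wants to add another bridge to the list: the lookup in bge_mbox_reorder() is a plain table match on PCI vendor/device IDs. Below is a minimal userland sketch of just that matching step. The helper name match_reorder_bridge() and the struct name bridge_id are hypothetical (they are not in the patch or the driver), and the table holds only the single AMD-8131 ID pair that the patch lists. The real function additionally walks the "pcib" devclass, which cannot be reproduced outside the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per bridge assumed to reorder mailbox writes. */
struct bridge_id {
	uint16_t vendor;
	uint16_t device;
	const char *desc;
};

/* Only the ID pair that appears in the patch; extend as needed. */
static const struct bridge_id reorder_bridges[] = {
	{ 0x1022, 0x7450, "AMD-8131 PCI-X Bridge" },
};

/*
 * Hypothetical helper: return the description of a matching bridge,
 * or NULL when the vendor/device pair is not in the table.
 */
const char *
match_reorder_bridge(uint16_t vendor, uint16_t device)
{
	size_t i, count;

	count = sizeof(reorder_bridges) / sizeof(reorder_bridges[0]);
	for (i = 0; i < count; i++) {
		if (reorder_bridges[i].vendor == vendor &&
		    reorder_bridges[i].device == device)
			return (reorder_bridges[i].desc);
	}
	return (NULL);
}
```

A non-NULL return corresponds to the case where the patch sets BGE_FLAG_MBOX_REORDER, after which each mailbox write is followed by a read of the same register to flush it.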
freebsd 9-stable TOP problem from around Jan 10
Has anyone else seen a problem with top -H -S? After a short while the screen gets more and more corrupted. Hitting ^L or turning off the S and H modes helps, for a while.

If this is a known, fixed problem, let me know, but I need to coordinate with others to upgrade the machine in question.

Julian
Re: Reducing the need to compile a custom kernel
Quoting Volodymyr Kostyrko (from Mon, 13 Feb 2012 17:44:33 +0200):

> Alexander Leidinger wrote:
> > Feasible: depends upon your definition of "feasible". You would have to
> > add all keymaps statically into the kernel. No idea which parts exactly
> > we talk about, but:
> > ---snip---
> > % du -h /usr/share/syscons/
> >  40k    /usr/share/syscons/scrnmaps
> > 570k    /usr/share/syscons/fonts
> > 1.1M    /usr/share/syscons/keymaps
> > 1.8M    /usr/share/syscons/
> > ---snip---
> >
> > I wouldn't mind for 40k, but 1.8M looks more like the value to calculate
> > with. Anyway, this is out of the scope of the original question.
>
> Correct me if I'm wrong but zfs already fetches plain file
> /boot/zfs/zpool.cache on load. Can't this be:
>
> 1. Postponed to later processing.
> 2. After filesystems are mounted the keymap is loaded.

This is already the case. You can set the keymap in rc.conf.

> Or even:
>
> 1. Put all viable files on the / partition.
> 2. Select and load correct one before kernel is fired.

This is not the same as compiling it into the kernel. Think about a problem where parts of your FS are corrupt / damaged / overwritten with nonsense. Yes, you can minimize the problem by loading it earlier, but having it in the kernel removes the keyboard problem completely.

Bye,
Alexander.

--
A lost ounce of gold may be found, a lost moment of time never.

http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org       netchild @ FreeBSD.org  : PGP ID = 72077137