Re: SMART

2009-11-12 Thread Thomas Backman
On Nov 12, 2009, at 1:25 PM, Ivan Voras wrote:
> Actually, it would be good if you taught more than him :)
> 
> I've always wondered how important are each of the dozen or so statistics and 
> what indicates what...
> 
> Here is for example my desktop drive:
> 
> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   087   083   006    Pre-fail  Always   -           45398197
>   3 Spin_Up_Time            0x0003   096   093   000    Pre-fail  Always   -           0
>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always   -           64
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           0
>   7 Seek_Error_Rate         0x000f   084   060   030    Pre-fail  Always   -           247407473
>   9 Power_On_Hours          0x0032   089   089   000    Old_age   Always   -           10155
>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always   -           0
>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always   -           64
> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always   -           0
> 189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always   -           0
> 190 Airflow_Temperature_Cel 0x0022   058   055   045    Old_age   Always   -           42 (Lifetime Min/Max 37/44)
> 194 Temperature_Celsius     0x0022   042   045   000    Old_age   Always   -           42 (0 20 0 0)
> 195 Hardware_ECC_Recovered  0x001a   062   059   000    Old_age   Always   -           45398197
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
> 200 Multi_Zone_Error_Rate   0x       100   253   000    Old_age   Offline  -           0
> 202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always   -           0
> 
> I see many values exceeding threshold but since I see it so often on other 
> drives I don't know what the threshold is for.
None of your values are exceeding the threshold - it works backwards: if 
the value is LOWER than the threshold, you might be in trouble.
Also, judging by the raw read error rate, seek error rate and hardware ECC 
recovered, allow me to guess that this is a Seagate drive. :-)
(Seagate drives, perhaps among others, use these raw values quite differently 
from other vendors. My Hitachi 7K1000.B has 0 on those.)

Regards,
Thomas
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: 8.0-RC USB/FS problem

2009-11-26 Thread Thomas Backman
On Nov 26, 2009, at 8:25 PM, Guojun Jin wrote:

> Shall I file a defect? Or can someone on this mailing list take care of this 
> problem before release?
> 
> -Jin
8.0 is halfway out already, you can download -RELEASE ISOs or upgrade using 
freebsd-update. The main announcement just hasn't been made yet.

Regards,
Thomas


Re: Phoronix Benchmarks: What's wrong with FreeBSD 8.0?

2009-11-30 Thread Thomas Backman
On Nov 30, 2009, at 9:47 AM, O. Hartmann wrote:

> I'm just wondering what's wrong with FreeBSD 8.0/amd64 when I read the 
> benchmarks on Phoronix.org's website. Especially FreeBSD's threaded I/O 
> shows the opposite of all claims that it had been improved.
Corrected link: 
http://www.phoronix.com/scan.php?page=article&item=freebsd8_benchmarks&num=1

And yeah, quite honestly: disk scheduling in FreeBSD appears to suck... It's 
the only reason I'm not switching from Linux. :(

Regards,
Thomas

(PS. See my thread about horrible console latency during disk IO in the 
archives - very related.)


Re: Phoronix Benchmarks: What's wrong with FreeBSD 8.0?

2009-11-30 Thread Thomas Backman
On Nov 30, 2009, at 12:38 PM, O. Hartmann wrote:

> Thomas Backman wrote:
>> On Nov 30, 2009, at 9:47 AM, O. Hartmann wrote:
>>> I'm just wondering what's wrong with FreeBSD 8.0/amd64 when I read the 
>>> benchmarks on Phoronix.org's website. Especially FreeBSD's threaded I/O 
>>> shows the opposite of all claims that it had been improved.
>> Corrected link: 
>> http://www.phoronix.com/scan.php?page=article&item=freebsd8_benchmarks&num=1
>> And yeah, quite honestly: disk scheduling in FreeBSD appears to suck... It's 
>> the only reason I'm not switching from Linux. :(
>> Regards,
>> Thomas
>> (PS. See my thread about horrible console latency during disk IO in the 
>> archives, very related. DS.)
> 
> Hello Thomas.
> I recall myself having had similar problems during heavy disk I/O (UFS and 
> ZFS) with stuck console, stuck clients and especially stuck X11-clients. The 
> discussion was really 'hot', but in the end no clear statement was made 
> whether this is disk-i/o related or a deeper problem in the scheduler.
> 
> Sorry for the lack of the link, I thought Phoronix is well known ...
> 
> Oliver
That's too bad, re: the scheduling. It seems to be quite a universal problem, 
yet I haven't seen much effort spent on it. :/

Re: Phoronix, I commented mostly because the site is .com and not .org, so I 
came to a parked domain when I clicked your link. :)
Also, I figured linking directly to the article would help the archives.

Regards,
Thomas


Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Thomas Backman
On Dec 4, 2009, at 8:56 PM, Stefan Bethke wrote:

> Am 04.12.2009 um 17:52 schrieb Stefan Bethke:
> 
>> I'm getting panics like this every so often (couple weeks, sometimes just a 
>> few days.) A second machine that has identical hardware and is running the 
>> same source has no such problems.
>> 
>> FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec  1 14:30:54 
>> UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT  amd64
>> 
>> # zpool status
>> pool: tank
>> state: ONLINE
>> scrub: none requested
>> config:
>> 
>>  NAMESTATE READ WRITE CKSUM
>>  tankONLINE   0 0 0
>>ad4s1dONLINE   0 0 0
>> # cat /boot/loader.conf
>> vfs.zfs.arc_max="512M"
>> vfs.zfs.prefetch_disable="1"
>> vfs.zfs.zil_disable="1"
> 
> Got another, different one.  Any tuning suggestions or similar?
> 
> 
> #6  0x80586c7a in vm_map_entry_splay (addr=Variable "addr" is not 
> available.
> )
>at /usr/src/sys/vm/vm_map.c:771
> #7  0x80587f37 in vm_map_lookup_entry (map=0xff0001e8, 
>address=18446743523979624448, entry=0xff80625db170)
>at /usr/src/sys/vm/vm_map.c:1021
> #8  0x80588aa3 in vm_map_delete (map=0xff0001e8, 
>start=18446743523979624448, end=18446743523979689984)
>at /usr/src/sys/vm/vm_map.c:2685
> #9  0x80588e61 in vm_map_remove (map=0xff0001e8, 
>start=18446743523979624448, end=18446743523979689984)
>at /usr/src/sys/vm/vm_map.c:2774
> #10 0x8057db85 in uma_large_free (slab=0xff005fcc7000)
>at /usr/src/sys/vm/uma_core.c:3021
> #11 0x80325987 in free (addr=0xff80018b, 
>mtp=0x80ac61e0) at /usr/src/sys/kern/kern_malloc.c:471
> #12 0x80a36d03 in vdev_cache_evict (vc=0xff0001723ce0, 
>ve=0xff003dd52200)
>at 
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:151
> #13 0x80a372ad in vdev_cache_read (zio=0xff005f5ca2d0)
>at 
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:182
Bad RAM/motherboard? My first thought when I read your first mail (re: 
identical hardware) was bad hardware, and this seems to point towards that too, 
no?
Have you tried memtest86+?

Regards,
Thomas


Re: [PATCH] Lockmgr deadlock on STABLE_8

2010-01-14 Thread Thomas Backman

On Jan 14, 2010, at 2:11 PM, Pete French wrote:

>> Also enable INVARIANTS.
> 
> Including INVARIANTS stops my kernel from building. It
> has been this way since 8.0 (this is why I only
> had WITNESS compiled in). It fails with many many
> errors like this:
> 
> /usr/src/sys/vm/vm_map.c:575: undefined reference to `_mtx_assert'
> 
> My kernel config file looks like this:
> 
>   include GENERIC
>   ident   WITNESS
> 
>   options KDB
>   options DDB
>   options WITNESS
>   options INVARIANTS
INVARIANTS requires INVARIANT_SUPPORT [sic] in the kernel config (see comments 
in GENERIC).
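For completeness, a config along these lines should build cleanly (the ident string is arbitrary; INVARIANT_SUPPORT is the piece that provides _mtx_assert and friends, which is exactly the symbol your link step is missing):

```
include GENERIC
ident   DEBUG

options KDB
options DDB
options WITNESS
options INVARIANTS
options INVARIANT_SUPPORT   # support code that INVARIANTS calls into
```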

Regards,
Thomas


Re: HEADS-UP: Shared Library Versions bumped...

2009-07-19 Thread Thomas Backman

On Jul 19, 2009, at 20:16, Ken Smith wrote:

> The problem is that as of the next time you update a machine that had
> been running -current you are best off reinstalling all ports or other
> applications you have on the machine.  When you reboot after doing the
> update to the base system everything you have installed will still work
> because the old shared library versions will still be there.  However
> anything you build on the machine after its base system gets updated
> would be linked against the newer base system shared libraries but any
> libraries that are part of ports or other applications (e.g. the Xorg
> libraries) would have been linked against the older library versions.
> You really don't want to leave things that way.

So, to be clear: a fresh ports tree and "portupgrade -af" after building 
and installing r195767+ should be enough to solve any problems? 
(installkernel, installworld, reboot, portupgrade -af)


Regards,
Thomas


Re: zfs on gmirror slice

2009-09-01 Thread Thomas Backman

On Sep 1, 2009, at 6:04 PM, Maciej Jan Broniarz wrote:


> 2009/9/1 Thomas Backman :
>> On Sep 1, 2009, at 4:04 PM, Maciej Jan Broniarz wrote:
>>
>> I'm not familiar with gmirror, but it'd be a way better idea to mirror it
>> using ZFS if possible - that way you get self-healing and stuff like that,
>> which you won't if ZFS doesn't have a mirror/RAIDZ setup, but only sees a
>> single slice.
>
> I would like to do so. I have two disks (ad4 and ad5). Is it possible
> to create two slices on both disks (eg ad4s1 and ad4s2 for ad4)?
> Then to create a gmirror on ad4s1 and install FreeBSD on it so it would
> boot from it, and then, after having my system running, to create a zfs
> mirror from ad4s2 and ad5s2?

Why not go ZFS all the way? ZFS on root is well supported these days 
(well, not by sysinstall, but it works great if you do it manually) - 
you don't even need a UFS /boot anymore.
Again, not familiar with gmirror, but my guess (which will be corrected 
or confirmed by someone ;)) is that your way would work too.


Regards,
Thomas


Re: zfs on gmirror slice

2009-09-01 Thread Thomas Backman

On Sep 1, 2009, at 4:04 PM, Maciej Jan Broniarz wrote:


> Hi,
>
> Is it a bad idea to create a zfs pool from a gmirrored slice?
> zpool create tank /dev/mirror/gm0s1g works fine, but after the reboot
> the filesystem fails its consistency check.
>
> Regards,
> mjb

I'm not familiar with gmirror, but it'd be a way better idea to mirror 
it using ZFS if possible - that way you get self-healing and stuff 
like that, which you won't if ZFS doesn't have a mirror/RAIDZ setup 
but only sees a single slice.


Regards,
Thomas


Re: zfs on gmirror slice

2009-09-02 Thread Thomas Backman

On Sep 2, 2009, at 10:27 AM, Mark Stapper wrote:


> Emil Mikulic wrote:
>> On Wed, Sep 02, 2009 at 09:20:21AM +0200, Mark Stapper wrote:
>>> updating a zfs filesystem which you are running from is next to
>>> impossible.
>>
>> [citation needed]  :)
>
> Well, to update your zfs filesystem version, the filesystem is first
> unmounted, then updated, and then mounted again.
> Citation coming up!
> # umount /
> umount: unmount of / failed: Invalid argument

Nothing a LiveCD or something to that regard can't handle. Obviously 
this doesn't work for everyone, but it should for many.

>>> So, I would recommend setting up gmirror to mirror your whole disks,
>>> install the base system (boot and "world") on a small UFS slice, and
>>> use the rest of the disk as a zfs slice.
>>
>> As Thomas Backman pointed out, this means you won't get self-healing.
>
> Self-healing sounds very nice, but with mirroring you have data on two
> discs, so in that case there's no "healing" involved, it's just
> checksumming and reading the non-corrupted copy.
> From the gmirror manpage: "All operations like failure detection, stale
> component detection, rebuild of stale components, etc. are also done
> automatically."
> This would indicate the same functionality, with a much less fancy name.
> However, I have not tested it the way they demonstrate zfs's
> "self-healing" property.
> I might, if I get the time to run it in a virtual machine one of these
> days..

If ZFS finds a corrupted copy and a non-corrupted one in a mirrored 
ZFS pool, it will repair the damage so that both copies are valid, so 
yes, self-healing will indeed occur. :)
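You don't even need a VM to try it - file-backed vdevs work. A rough sketch, assuming a FreeBSD box with ZFS and root access (pool name, file paths and sizes are arbitrary, and the 4 MB offset is just to avoid clobbering the front vdev labels):

```shell
# Self-healing demo sketch using file-backed vdevs (no real disks touched).
truncate -s 128m /tmp/vdev0 /tmp/vdev1
zpool create healtest mirror /tmp/vdev0 /tmp/vdev1
cp /boot/kernel/kernel /healtest/testfile

# Simulate silent corruption on one side of the mirror:
dd if=/dev/random of=/tmp/vdev0 bs=1m count=8 seek=4 conv=notrunc

# A scrub detects the bad checksums and rewrites the damaged blocks
# from the intact copy on the other vdev:
zpool scrub healtest
zpool status healtest   # expect CKSUM errors on vdev0, with data repaired

# Clean up:
zpool destroy healtest
rm /tmp/vdev0 /tmp/vdev1
```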


Regards,
Thomas


Re: zfs kernel panic

2009-09-08 Thread Thomas Backman

On Sep 8, 2009, at 5:23 PM, Gerrit Kühn wrote:


> Hi folks,
>
> I just upgraded a zfs server from 7.0-something to 7.2-stable and hoped
> to get rid of some minor instabilities I experienced every 6 months or
> so. Unfortunately, the new system crashed for the first time after only
> a few hours when copying some files via scp onto it.
> I got a kernel panic which looked quite similar to the one reported
> here (kmem_map too small):
>
> I have a dual cpu dual core opteron system with 4GB of RAM and a 3-disk
> raidz1. I took out the memory settings from loader.conf as suggested in
> UPDATING. I did not yet upgrade zpool nor zfs version (would that
> help?).
>
> Are there any known issues or any further hints what might cause the
> crash? I copied the files again, but this time everything went fine.

Hmm. Do you use i386 or amd64? This panic is (was?) pretty common on 
i386 before tuning, but... 4GB of RAM and an Opteron should have you 
running amd64, no?
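For reference, the i386 tuning in question usually meant loader.conf entries along these lines - the exact values are examples only, and the right numbers depend on your RAM and workload (see UPDATING and the ZFS tuning wiki):

```
# /boot/loader.conf - illustrative i386 ZFS tuning from the 7.x era
vm.kmem_size="512M"
vm.kmem_size_max="512M"
vfs.zfs.arc_max="128M"
vfs.zfs.prefetch_disable="1"
```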


Regards,
Thomas


Extreme console latency during disk IO (8.0-RC1, previous releases also affected according to others)

2009-10-12 Thread Thomas Backman
I'm copying this over from the freebsd-performance list, as I'm  
looking for a few more opinions - not on the problems *I* am having,  
but rather to check whether the problem is universal or not, and if  
not, find a possible common factor.

In other words: I want to hear about your experiences, *good or bad*!

Here's the original thread (not from the beginning, though): 
http://lists.freebsd.org/pipermail/freebsd-performance/2009-October/003843.html

Long story short, my version: when the disk is stressed hard enough,  
console IO becomes COMPLETELY unbearable. 10+ seconds to switch  
between windows in screen(1), running (or even typing) simple  
commands, etc. This happens both via SSH and the serial console.


How to reproduce/test:
1) time file /etc/* > /dev/null a few times, or something similar that  
uses the disk; write down a common/average/median/whatever time.
2) cat /dev/zero > /uncompressed_fs/filename # please make *sure* it's  
uncompressed, since ZFS with lzjb/gzip enabled will squish this into a  
kilobyte-sized file, thus creating virtually *no* IO.
3) When cat has been running say 10 seconds, re-time command #1 and do  
some interactive stuff - run commands, edit files, etc.


I couldn't actually reproduce the *completely* horrific increase in  
latency I posted about below just now (I did update my sources and  
rebuild, but I'm pretty sure the delta between ~Sep 29 and Oct 6 had  
no major IO changes in 8-STABLE), and the "time file /etc/*" test  
"only" jumped by about 3x (compared to 20-60x+ previously), but it's  
still bad enough: commands such as "ls" and "w" take 2-3 seconds to  
run, as opposed to 0.005s for ls without the added IO... On Linux, the  
increase in latency is closer to 4%. A bit better than, oh, 400  
times. ;)


Oh, and again: this post is not a complaint; this is a post asking for  
your experiences. I know I'm not alone in having these issues - I just  
want to know if there are a lot of people that *don't* too, and what  
could cause them. I can't possibly switch to FreeBSD in production  
with this behaviour - and I've been looking forward to doing so for  
quite a while now.


Regards,
Thomas

PS.

I'll leave my post to the original discussion below. (I don't usually  
top post, but I don't consider this a reply, more of a new post with  
an addition below.)


On Oct 5, 2009, at 10:45 AM, Thomas Backman wrote:


Hey everyone,
I'm having serious trouble with the same thing, and just found this  
thread while looking for the correct place to post. Looks like I  
found it. (I wrote most of the post before finding the thread, so  
some of it will seem a bit odd.)


I run 8.0-RC1/amd64 with ZFS on an A64 3200+ with 2GB RAM and an old  
80GB 7200rpm disk.


My problem is that I get completely unacceptable latency on console  
IO (both via SSH and serial console) when the system is performing  
disk IO. The worst case I've noticed yet was when I tried copying a  
core dump from a lzjb compressed ZFS file system to a gzip-9  
compressed one, to compare the file size/compression ratio. screen 
(1) took at LEAST ten seconds - probably a bit more - I'm not  
exaggerating here - to switch to another window, and an "ls" in an  
empty directory also about 5-10 seconds.
Doing some silly CPU load with two instances of "yes >/dev/null" (on  
a single core, remember) doesn't change anything, the system remains  
very responsive. "cat /dev/zero > /uncompressed_fs/..." however  
produces the extreme slowdown. (On a gzip-1 FS it doesn't, since the  
file ends up extremely small - a kilobyte or so - even after a  
while, thus performing minimal IO).


I'm thinking about switching to FreeBSD on my beefier "production"  
system (dual-core amd64, 4GB RAM, 4x1TB disks, compared to this one,  
single-core, 2GB RAM, 80GB disk), but unless I feel assured this  
won't happen there as well, I'm not so sure anymore. I can do any  
kind of heavy IO/compilation/whatever on that box, currently running  
Linux, and it's always unnoticeable. In this case it's impossible 
*not* to notice that your key input is lagging 5-10 seconds 
behind... I thought multiple times that the box must have panicked.
I do realize that the hardware isn't the best, especially the disks,  
but this is far worse than it should be!


Here's some of the testing done in this thread (or at least  
something like it):


[r...@chaos ~]# time file /etc/* >/dev/null
real0m1.725s
user0m0.993s
sys 0m0.021s
[r...@chaos ~]# time file /etc/* >/dev/null

real0m1.008s
user0m0.990s
sys 0m0.015s
[r...@chaos ~]# time file /etc/* >/dev/null

real0m1.008s
user0m0.967s
sys 0m0.038s
[r...@chaos ~]# time file /etc/* >/dev/null

real0m1.015s
user0m0.998s
sys  

Re: Extreme console latency during disk IO (8.0-RC1, previous releases also affected according to others)

2009-10-13 Thread Thomas Backman


On Oct 13, 2009, at 12:35 AM, Luigi Rizzo wrote:


> On Mon, Oct 12, 2009 at 09:48:42PM +0200, Thomas Backman wrote:
>> I'm copying this over from the freebsd-performance list, as I'm
>> looking for a few more opinions - not on the problems *I* am having,
>> but rather to check whether the problem is universal or not, and if
>> not, find a possible common factor.
>> In other words: I want to hear about your experiences, *good or bad*!
>>
>> Here's the original thread (not from the beginning, though):
>> http://lists.freebsd.org/pipermail/freebsd-performance/2009-October/003843.html
>>
>> Long story short, my version: when the disk is stressed hard enough,
>> console IO becomes COMPLETELY unbearable. 10+ seconds to switch
>> between windows in screen(1), running (or even typing) simple
>> commands, etc. This happens both via SSH and the serial console.
>
> hi,
> this issue (not specific to FreeBSD, and not new -- it has been
> like this forever) is discussed in some detail here
>
> http://www.bsdcan.org/2009/schedule/events/122.en.html
>
> The following code (a bit outdated) can help
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2009-March/048704.html
>
> cheers
> luigi
Hmm, how stable would you say the code is? (And/or has there been any 
progress since March?)
I'd prefer something I'd feel confident using in production, and the 
warning in the README clearly says "stay away!".


Regards,
Thomas