Re: zfs arc and amount of wired memory

2012-02-08 Thread Eugene M. Zheganin

Hi.

On 09.02.2012 02:29, Andriy Gapon wrote:

> on 08/02/2012 12:31 Eugene M. Zheganin said the following:
>> Hi.
>>
>> On 08.02.2012 02:17, Andriy Gapon wrote:
>>> [output snipped]
>>>
>>> Thank you.  I don't see anything suspicious/unusual there.
>>> Just case, do you have ZFS dedup enabled by a chance?
>>>
>>> I think that examination of vmstat -m and vmstat -z outputs may provide some
>>> clues as to what got all that memory wired.
>>
>> Nope, I don't have deduplication feature enabled.
>
> OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?

I did. I didn't understand it, but kinda 'felt the atmosphere'. It was
pretty much similar to the output I supplied below. Most of the sizes
were used by 'solaris' and numerous 'zio' caches.

> It could be very well possible that swap on zvol doesn't work well when the
> kernel itself is starved on memory.
>
>> So I want to ask - how to report it and what should I include in such pr ?
>
> I am leaving swap-on-zvol issue aside.  Your original problem doesn't seem to be
> ZFS-related.  I suspect that you might be running into some kernel memory leak.
>  If you manage to reproduce the high wired value again, then vmstat -m and
> vmstat -z may provide some useful information.
>
> In this vein, do you use any out-of-tree kernel modules?
> Also, can you try to monitor your system to see when wired count grows?


Nope, I don't have any third-party kernel modules.
Yes, I can monitor it, but I have no idea what exactly I should monitor.
This system is running squid with dozens of authentication helpers,
freeradius + postgresql, sendmail and a perl squid log parser, which
uses postgresql too. net/isc-dhcp, quagga, net/mpd5, a bunch of sendmail
milters, net/samba35, bind. So it's some kind of a corporate production
zoo. As I write this letter, the amount of wired memory has increased by 70
megs. Excuse me, 80 megs now.
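
One possible way to watch the wired count over time is sketched below in
Python. It is only a sketch under assumptions: it takes the "Wired" figure
to be vm.stats.vm.v_wire_count pages multiplied by hw.pagesize, reads both
through the sysctl(8) command, and the helper names (sysctl_int, wired_mib,
growth, monitor) are made up for this illustration.

```python
import subprocess
import time


def sysctl_int(name):
    """Read one integer value through the sysctl(8) command-line tool."""
    out = subprocess.check_output(["sysctl", "-n", name], text=True)
    return int(out.strip())


def wired_mib(wire_pages, page_size):
    """Convert a wired page count into MiB."""
    return wire_pages * page_size / (1024 * 1024)


def growth(samples):
    """Return the difference between each reading and the one before it."""
    return [b - a for a, b in zip(samples, samples[1:])]


def monitor(interval_s=60, count=5):
    """Print wired memory every interval_s seconds (FreeBSD only).

    The sysctl names below are assumptions of this sketch.
    """
    page_size = sysctl_int("hw.pagesize")
    previous = None
    for _ in range(count):
        now = wired_mib(sysctl_int("vm.stats.vm.v_wire_count"), page_size)
        delta = 0.0 if previous is None else now - previous
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} wired={now:.0f}M delta={delta:+.0f}M")
        previous = now
        time.sleep(interval_s)
```

Calling monitor() from cron or a screen session and keeping the log next to
periodic vmstat -m snapshots should show whether the growth is steady or
tied to a particular job.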


The output I promised (if it's more acceptable in the form of a link to
a paste site, just say so):


[emz@taiga:etc/snmp]# vmstat -m
         Type InUse MemUse HighUse Requests  Size(s)
        hhook     2     1K       -        2  128
      ithread    85    14K       -       85  32,128,256
       KTRACE   100    13K       -      100  128
       linker   280   226K       -      384  16,32,64,128,256,512,1024,2048,4096
        lockf    94    10K       - 20264872  64,128
   loginclass     3     1K       -      367  64
     pci_link    13     2K       -       13  16,128
       ip6ndp    55     5K       -       78  64,128
       ip6opt    23     6K       -   142134  32,256
         temp   146    20K       -   114199  16,32,64,128,256,512,1024,2048,4096
       devbuf 28285 56235K       -    29225  16,32,64,128,256,512,1024,2048,4096
       module   291    37K       -      291  128
       USBdev    39    10K       -       39  64,128,512,1024
     mtx_pool     2    16K       -        2
          USB    55   166K       -       58  16,32,64,128,256,512,2048,4096
          osd    22     1K       -    10870  16,64
  ddb_capture     1    48K       -        1
      subproc   831  1312K       -    56233  512,4096
         proc     2    16K       -        2
      session    66     9K       -    16431  128
         pgrp    73    10K       -    16581  128
         cred   650   102K       -   818736  64,256
      uidinfo    15     4K       -     5420  128,2048
       plimit    25     7K       -     4948  256
       kbdmux     8    18K       -        8  16,512,1024,2048
    sysctltmp     0     0K       -  9741241  16,32,64,128,4096
    sysctloid  4837   243K       -     4950  16,32,64,128
       sysctl     0     0K       -    50230  16,32,64
      tidhash     1    16K       -        1
      callout     3  1536K       -        3
         umtx  2712   339K       -     2766  128
     p1003.1b     1     1K       -        1  16
         SWAP     2  1097K       -        2  64
       bus-sc    84   686K       -     2193  16,32,64,128,256,512,1024,2048,4096
          bus   861    78K       -     4641  16,32,64,128,256,512,1024
      devstat     4     9K       -        4  32,4096
 eventhandler    83     7K       -       83  64,128
         kobj   194   776K       -      231  4096
      Per-cpu     1     1K       -        1  32
       aacbuf   241    72K       -      273  64,128,512
         rman   219    23K       -      449  16,32,128
     acpiintr     1     1K       -        1  64
         sbuf     1     1K       -      967  16,32,64,128,256,512,1024,2048,4096
       acpica  1641   174K       -    50289  16,32,64,128,256,512,1024,2048,4096
       DEVFS1   106    53K       -      111  512
       DEVFS3   261    66K       -      269  256
        stack     0     0K       -        2  256
    taskqueue    85     8K       -      121  16,32,64,128,1024
       Unitno    21     1K       -   208557  32,64
       DEVFS2   106     2K       -      108  16
   DEVFS_RULE    54    26K       -       54  64,512
        DEVFS    39     1K       -       40  16,128
      Witness     1   128K       -        1
  iov 0 

Re: zfs arc and amount of wired memory

2012-02-08 Thread Gary Palmer
On Wed, Feb 08, 2012 at 11:18:02PM +0200, Andriy Gapon wrote:
> on 08/02/2012 22:50 Jeremy Chadwick said the following:
> > Politely -- recommending this to a user is a good choice of action, but
> > the problem is that no user, even an experienced user, is going to know
> > what all of the "Types" (vmstat -m) or "ITEMs" (vmstat -z) correlate
> > with on the system.
> 
> I see no problem with users sharing the output and asking for help 
> interpreting
> it.  I do not know of any easier way to analyze problems like this one.

Also, since we are looking for gigs of memory, it should be relatively easy
to look down the 'Size' or 'MemUse' columns and identify likely candidates
for "eating gobs of memory".  The user doesn't need to know what the rest
of the data means, and can ask what that line means and how to fix it.
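
As a rough illustration of "looking down the MemUse column", the Python
sketch below sorts vmstat -m rows by MemUse. It assumes the usual column
order (Type, InUse, MemUse with a K suffix, HighUse shown as "-", Requests);
the function name top_malloc_types and the SAMPLE text are made up here,
with the sample rows copied from the output posted earlier in this thread.

```python
import re

# One data row: type name (may contain spaces), InUse, MemUse in KiB,
# the "-" HighUse placeholder, then Requests.
ROW = re.compile(
    r"^\s*(?P<type>.+?)\s+(?P<inuse>\d+)\s+(?P<mem>\d+)K\s+-\s+(?P<req>\d+)"
)


def top_malloc_types(vmstat_m_text, limit=5):
    """Return (type, MemUse in KiB) pairs, largest MemUse first."""
    rows = []
    for line in vmstat_m_text.splitlines():
        m = ROW.match(line)
        if m:  # the header and wrapped Size(s) lines do not match
            rows.append((m.group("type"), int(m.group("mem"))))
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:limit]


# Hypothetical input: a few rows from the vmstat -m output in this thread.
SAMPLE = """\
         Type InUse MemUse HighUse Requests  Size(s)
       devbuf 28285 56235K       -    29225  16,32,64,128,256,512,1024,2048,4096
      subproc   831  1312K       -    56233  512,4096
      callout     3  1536K       -        3
         kobj   194   776K       -      231  4096
"""

for name, kib in top_malloc_types(SAMPLE, limit=3):
    print(f"{name:>13} {kib:>8}K")
```

The same idea should carry over to vmstat -z, but the columns there are
different, so the pattern would need to be adapted.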

Gary
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: zfs arc and amount of wired memory

2012-02-08 Thread Charles Sprickman

On Feb 8, 2012, at 7:43 PM, Artem Belevich wrote:

> On Wed, Feb 8, 2012 at 4:28 PM, Jeremy Chadwick wrote:
>> On Thu, Feb 09, 2012 at 01:11:36AM +0100, Miroslav Lachman wrote:
> ...
>>> ARC Size:
>>>  Current Size: 1769 MB (arcsize)
>>>  Target Size (Adaptive):   512 MB (c)
>>>  Min Size (Hard Limit):512 MB (zfs_arc_min)
>>>  Max Size (Hard Limit):3584 MB (zfs_arc_max)
>>> 
>>> The target size is going down to the min size and after few more
>>> days, the system is so slow, that I must reboot the machine. Then it
>>> is running fine for about 107 days and then it all repeat again.
>>> 
>>> You can see more on MRTG graphs
>>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
>>> You can see links to other useful informations on top of the page
>>> (arc_summary, top, dmesg, fs usage, loader.conf)
>>> 
>>> There you can see nightly backups (higher CPU load started at
>>> 01:13), otherwise the machine is idle.
>>> 
>>> It coresponds with ARC target size lowering in last 5 days
>>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
>>> 
>>> And with ARC metadata cache overflowing the limit in last 5 days
>>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html
>>> 
>>> I don't know what's going on and I don't know if it is something
>>> know / fixed in newer releases. We are running a few more ZFS
>>> systems on 8.2 without this issue. But those systems are in
>>> different roles.
>> 
>> This sounds like the... damn, what is it called... some kind of internal
>> "counter" or "ticks" thing within the ZFS code that was discovered to
>> only begin happening after a certain period of time (which correlated to
>> some number of days, possibly 107).  I'm sorry that I can't be more
>> specific, but it's been discussed heavily on the lists in the past, and
>> fixes for all of that were committed to RELENG_8.  I wish I could
>> remember the name of the function or macro or variable name it pertained
>> to, something like LTHAW or TLOCK or something like that.  I would say
>> "I don't know why I can't remember", but I do know why I can't remember:
>> because I gave up trying to track all of these problems.
>> 
>> Does someone else remember this issue?  CC'ing Martin who might remember
>> for certain.
> 
> It's LBOLT. :-)
> 
> And there was more than one related integer overflow. One of them
> manifested itself as L2ARC feeding thread hogging CPU time after about
> a month of uptime. Another one caused issue with ARC reclaim after 107
> days. See more details in this thread:
> 
> http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011584.html

This would be an excellent piece of information to have on one of the ZFS
wiki pages.  The 107 day issue exists post-8.2, correct?  Anyone on this 
cc: list have permissions to edit those pages?

Thanks,

Charles

> 
> --Artem


Re: zfs arc and amount of wired memory

2012-02-08 Thread Charles Sprickman

On Feb 8, 2012, at 7:11 PM, Miroslav Lachman wrote:

> Andriy Gapon wrote:
>> on 08/02/2012 12:31 Eugene M. Zheganin said the following:
>>> Hi.
>>> 
>>> On 08.02.2012 02:17, Andriy Gapon wrote:
>>>> [output snipped]
>>>>
>>>> Thank you.  I don't see anything suspicious/unusual there.
>>>> Just case, do you have ZFS dedup enabled by a chance?
>>>>
>>>> I think that examination of vmstat -m and vmstat -z outputs may provide some
>>>> clues as to what got all that memory wired.
>>>>
>>> Nope, I don't have deduplication feature enabled.
>> 
>> OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?
>> 
>>> By the way, today, after eating another 100M of wired memory this server 
>>> hanged
>>> out with multiple non-stopping messages
>>> 
>>> swap_pager: indefinite wait buffer
>>> 
>>> Since it's swapping on zvol, it looks to me like it could be the mentioned 
>>> in
>>> another thread here ("Swap on zvol - recommendable?") resource starvation 
>>> issue;
>>> may be it happens faster when the ARC isn't limited.
>> 
>> It could be very well possible that swap on zvol doesn't work well when the
>> kernel itself is starved on memory.
>> 
>>> So I want to ask - how to report it and what should I include in such pr ?
>> 
>> I am leaving swap-on-zvol issue aside.  Your original problem doesn't seem 
>> to be
>> ZFS-related.  I suspect that you might be running into some kernel memory 
>> leak.
>>  If you manage to reproduce the high wired value again, then vmstat -m and
>> vmstat -z may provide some useful information.
>> 
>> In this vein, do you use any out-of-tree kernel modules?
>> Also, can you try to monitor your system to see when wired count grows?
> 
> I am seeing something similar on one of our machine. This is old 7.3 with ZFS 
> v13, that's why I did not reported it.
> 
> The machine is used as storage for backups made by rsync. All is running fine 
> for about 107 days. Then backups are slower and slower because of some 
> strange memory situation.
> 
> Mem: 15M Active, 17M Inact, 3620M Wired, 420K Cache, 48M Buf, 1166M Free
> 
> ARC Size:
> Current Size: 1769 MB (arcsize)
> Target Size (Adaptive):   512 MB (c)
> Min Size (Hard Limit):512 MB (zfs_arc_min)
> Max Size (Hard Limit):3584 MB (zfs_arc_max)
> 
> The target size is going down to the min size and after few more days, the 
> system is so slow, that I must reboot the machine. Then it is running fine 
> for about 107 days and then it all repeat again.
> 
> You can see more on MRTG graphs
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
> You can see links to other useful informations on top of the page 
> (arc_summary, top, dmesg, fs usage, loader.conf)
> 
> There you can see nightly backups (higher CPU load started at 01:13), 
> otherwise the machine is idle.
> 
> It coresponds with ARC target size lowering in last 5 days
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
> 
> And with ARC metadata cache overflowing the limit in last 5 days
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html

I'm not having luck finding it, but there's some known issue that exists even 
in 8.2 where some 32-bit counter overflows or something. I don't truly remember 
the logic in it, but when you hit it, it's around 110 days or so.  Before it 
gets really bad (to the point where you either reboot or get some memory 
exhaustion panic), you can see zfs "evict skips" incrementing rapidly.  Looking 
at that graph, that would be my guess as to what's happening to you.  It's easy 
to check - run one of the arc stats scripts, look for "evict_skips", note the 
number and then run it a few minutes later.  If it increases by more than a few 
hundred, you've hit the bug.  You'll find at that point the kernel is no longer 
"evicting" ARC from the kernel and it will just continue to grow until bad 
things happen.
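
The check described above could be scripted roughly as follows (Python).
This is a sketch under assumptions: the counter is read from the sysctl
kstat.zfs.misc.arcstats.evict_skip, which is where the arc stats scripts
are assumed to get it, and the threshold of 300 is just one reading of
"more than a few hundred". Both function names are made up.

```python
import subprocess


def read_evict_skip():
    """Read the ARC evict_skip counter (assumed sysctl name, FreeBSD only)."""
    out = subprocess.check_output(
        ["sysctl", "-n", "kstat.zfs.misc.arcstats.evict_skip"], text=True
    )
    return int(out.strip())


def looks_like_evict_bug(before, after, threshold=300):
    """True if the counter grew by more than threshold between two readings."""
    return (after - before) > threshold
```

Take one reading with read_evict_skip(), wait a few minutes, take another,
and pass both to looks_like_evict_bug().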

Charles

> 
> I don't know what's going on and I don't know if it is something know / fixed 
> in newer releases. We are running a few more ZFS systems on 8.2 without this 
> issue. But those systems are in different roles.
> 
> Miroslav Lachman


Re: zfs arc and amount of wired memory

2012-02-08 Thread Miroslav Lachman

Artem Belevich wrote:

> On Wed, Feb 8, 2012 at 4:28 PM, Jeremy Chadwick wrote:
>> On Thu, Feb 09, 2012 at 01:11:36AM +0100, Miroslav Lachman wrote:
> ...
>>> ARC Size:
>>>   Current Size:             1769 MB (arcsize)
>>>   Target Size (Adaptive):   512 MB (c)
>>>   Min Size (Hard Limit):    512 MB (zfs_arc_min)
>>>   Max Size (Hard Limit):    3584 MB (zfs_arc_max)
>>>
>>> The target size is going down to the min size and after few more
>>> days, the system is so slow, that I must reboot the machine. Then it
>>> is running fine for about 107 days and then it all repeat again.
>>>
>>> You can see more on MRTG graphs
>>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
>>> You can see links to other useful informations on top of the page
>>> (arc_summary, top, dmesg, fs usage, loader.conf)
>>>
>>> There you can see nightly backups (higher CPU load started at
>>> 01:13), otherwise the machine is idle.
>>>
>>> It coresponds with ARC target size lowering in last 5 days
>>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
>>>
>>> And with ARC metadata cache overflowing the limit in last 5 days
>>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html
>>>
>>> I don't know what's going on and I don't know if it is something
>>> know / fixed in newer releases. We are running a few more ZFS
>>> systems on 8.2 without this issue. But those systems are in
>>> different roles.
>>
>> This sounds like the... damn, what is it called... some kind of internal
>> "counter" or "ticks" thing within the ZFS code that was discovered to
>> only begin happening after a certain period of time (which correlated to
>> some number of days, possibly 107).  I'm sorry that I can't be more
>> specific, but it's been discussed heavily on the lists in the past, and
>> fixes for all of that were committed to RELENG_8.

Thank you for your quick response. I am glad that it is fixed in 8.x. So
I will upgrade this last old machine in few weeks. :)

>> I wish I could
>> remember the name of the function or macro or variable name it pertained
>> to, something like LTHAW or TLOCK or something like that.  I would say
>> "I don't know why I can't remember", but I do know why I can't remember:
>> because I gave up trying to track all of these problems.
>>
>> Does someone else remember this issue?  CC'ing Martin who might remember
>> for certain.
>
> It's LBOLT. :-)
>
> And there was more than one related integer overflow. One of them
> manifested itself as L2ARC feeding thread hogging CPU time after about
> a month of uptime. Another one caused issue with ARC reclaim after 107
> days. See more details in this thread:
>
> http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011584.html


Yes, it is exactly this problem. Thank you for the link to this thread. 
I am subscribed to freebsd-fs@ and I am reading it almost daily, but I 
missed this one!


Thanks to both of you!

Miroslav Lachman


Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Mike Tancsa
On 2/8/2012 4:27 PM, Jeremy Chadwick wrote:
> 
> This indicates the controller on channel 1 (siisch1) is "stalled"
> waiting for underlying communication with the device attached to it.

Hi,
But which device? The PM itself, or the disks behind it? And which
disk?
> 
> 
> This is almost certainly a lower level problem with the disk that cannot
> be addressed/solved via normal means.  Thus, my recommendation is to
> replace the disk.

I would gladly replace it if I knew which one :)

> 
> Regarding the repeated errors at semi-regular (but not entirely)
> intervals: are you using smartd?  Do you have a cronjob that issues
> smartctl -a or smartctl -x commands at intervals?  I imagine any of
> these could be tickling something lower level.

I don't have smartd running. The box takes a lot of backups as well as a
constant stream of netflow data, so there are a lot of writes to it.

> 
> Also, please upgrade your smartmontools to 5.42.  It does provide some
> further enhancements that are useful.
> 

Done.


# smartctl -x /dev/ada9
smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-STABLE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11
Device Model:     ST31000333AS
Serial Number:    9TE14SRV
LU WWN Device Id: 5 000c50 010a39664
Firmware Version: SD35
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Wed Feb  8 20:00:47 2012 EST

==> WARNING: There are known problems with these drives,
see the following Seagate web pages:
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207951
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207957

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  617) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off
                                        support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 203) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103b) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   112   099   006    -    44490692
  3 Spin_Up_Time            PO----   093   092   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    68
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    2
  7 Seek_Error_Rate         POSR--   088   060   030    -    791764702
  9 Power_On_Hours          -O--CK   075   075   000    -    22759
 10 Spin_Retry_Count        PO--C-   100   100   097    -    2
 12 Power_Cycle_Count       -O--CK   100   100   020    -    68
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   095   095   000    -    5
188 Command_Timeout -O--CK   100   10

Re: zfs arc and amount of wired memory

2012-02-08 Thread Artem Belevich
On Wed, Feb 8, 2012 at 4:28 PM, Jeremy Chadwick wrote:
> On Thu, Feb 09, 2012 at 01:11:36AM +0100, Miroslav Lachman wrote:
...
>> ARC Size:
>>          Current Size:             1769 MB (arcsize)
>>          Target Size (Adaptive):   512 MB (c)
>>          Min Size (Hard Limit):    512 MB (zfs_arc_min)
>>          Max Size (Hard Limit):    3584 MB (zfs_arc_max)
>>
>> The target size is going down to the min size and after few more
>> days, the system is so slow, that I must reboot the machine. Then it
>> is running fine for about 107 days and then it all repeat again.
>>
>> You can see more on MRTG graphs
>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
>> You can see links to other useful informations on top of the page
>> (arc_summary, top, dmesg, fs usage, loader.conf)
>>
>> There you can see nightly backups (higher CPU load started at
>> 01:13), otherwise the machine is idle.
>>
>> It coresponds with ARC target size lowering in last 5 days
>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
>>
>> And with ARC metadata cache overflowing the limit in last 5 days
>> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html
>>
>> I don't know what's going on and I don't know if it is something
>> know / fixed in newer releases. We are running a few more ZFS
>> systems on 8.2 without this issue. But those systems are in
>> different roles.
>
> This sounds like the... damn, what is it called... some kind of internal
> "counter" or "ticks" thing within the ZFS code that was discovered to
> only begin happening after a certain period of time (which correlated to
> some number of days, possibly 107).  I'm sorry that I can't be more
> specific, but it's been discussed heavily on the lists in the past, and
> fixes for all of that were committed to RELENG_8.  I wish I could
> remember the name of the function or macro or variable name it pertained
> to, something like LTHAW or TLOCK or something like that.  I would say
> "I don't know why I can't remember", but I do know why I can't remember:
> because I gave up trying to track all of these problems.
>
> Does someone else remember this issue?  CC'ing Martin who might remember
> for certain.

It's LBOLT. :-)

And there was more than one related integer overflow. One of them
manifested itself as L2ARC feeding thread hogging CPU time after about
a month of uptime. Another one caused issue with ARC reclaim after 107
days. See more details in this thread:

http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011584.html
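
The arithmetic behind the two intervals is not spelled out here; the Python
sketch below shows one set of assumptions that happens to reproduce both
figures. It assumes kern.hz is 1000, that the 107-day case comes from
multiplying nanosecond uptime by hz in a signed 64-bit integer, and that
the roughly one-month case comes from a signed 32-bit tick counter. The
function names are made up, and the linked thread remains the authority on
what the code actually did.

```python
HZ = 1000                   # assumed kern.hz
NANOSEC = 1_000_000_000     # nanoseconds per second
SECONDS_PER_DAY = 86_400


def days_until_int64_overflow(hz=HZ):
    """Uptime in days at which uptime_ns * hz no longer fits in int64."""
    max_ns = (2**63 - 1) // hz
    return max_ns / NANOSEC / SECONDS_PER_DAY


def days_until_int32_tick_overflow(hz=HZ):
    """Uptime in days at which a signed 32-bit tick counter wraps."""
    return (2**31 - 1) / hz / SECONDS_PER_DAY


print(f"int64 product: {days_until_int64_overflow():.2f} days")
print(f"int32 ticks:   {days_until_int32_tick_overflow():.2f} days")
```

Under these assumptions the first figure comes out just under 107 days and
the second just under 25 days, which lines up with "107 days" and "about a
month".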

--Artem


Re: zfs arc and amount of wired memory

2012-02-08 Thread Jeremy Chadwick
On Thu, Feb 09, 2012 at 01:11:36AM +0100, Miroslav Lachman wrote:
> Andriy Gapon wrote:
> >on 08/02/2012 12:31 Eugene M. Zheganin said the following:
> >>Hi.
> >>
> >>On 08.02.2012 02:17, Andriy Gapon wrote:
> >>>[output snipped]
> >>>
> >>>Thank you.  I don't see anything suspicious/unusual there.
> >>>Just case, do you have ZFS dedup enabled by a chance?
> >>>
> >>>I think that examination of vmstat -m and vmstat -z outputs may provide 
> >>>some
> >>>clues as to what got all that memory wired.
> >>>
> >>Nope, I don't have deduplication feature enabled.
> >
> >OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?
> >
> >>By the way, today, after eating another 100M of wired memory this server 
> >>hanged
> >>out with multiple non-stopping messages
> >>
> >>swap_pager: indefinite wait buffer
> >>
> >>Since it's swapping on zvol, it looks to me like it could be the mentioned 
> >>in
> >>another thread here ("Swap on zvol - recommendable?") resource starvation 
> >>issue;
> >>may be it happens faster when the ARC isn't limited.
> >
> >It could be very well possible that swap on zvol doesn't work well when the
> >kernel itself is starved on memory.
> >
> >>So I want to ask - how to report it and what should I include in such pr ?
> >
> >I am leaving swap-on-zvol issue aside.  Your original problem doesn't seem 
> >to be
> >ZFS-related.  I suspect that you might be running into some kernel memory 
> >leak.
> >  If you manage to reproduce the high wired value again, then vmstat -m and
> >vmstat -z may provide some useful information.
> >
> >In this vein, do you use any out-of-tree kernel modules?
> >Also, can you try to monitor your system to see when wired count grows?
> 
> I am seeing something similar on one of our machine. This is old 7.3
> with ZFS v13, that's why I did not reported it.
> 
> The machine is used as storage for backups made by rsync. All is
> running fine for about 107 days. Then backups are slower and slower
> because of some strange memory situation.
> 
> Mem: 15M Active, 17M Inact, 3620M Wired, 420K Cache, 48M Buf, 1166M Free
> 
> ARC Size:
>  Current Size: 1769 MB (arcsize)
>  Target Size (Adaptive):   512 MB (c)
>  Min Size (Hard Limit):512 MB (zfs_arc_min)
>  Max Size (Hard Limit):3584 MB (zfs_arc_max)
> 
> The target size is going down to the min size and after few more
> days, the system is so slow, that I must reboot the machine. Then it
> is running fine for about 107 days and then it all repeat again.
> 
> You can see more on MRTG graphs
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
> You can see links to other useful informations on top of the page
> (arc_summary, top, dmesg, fs usage, loader.conf)
> 
> There you can see nightly backups (higher CPU load started at
> 01:13), otherwise the machine is idle.
> 
> It coresponds with ARC target size lowering in last 5 days
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
> 
> And with ARC metadata cache overflowing the limit in last 5 days
> http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html
> 
> I don't know what's going on and I don't know if it is something
> know / fixed in newer releases. We are running a few more ZFS
> systems on 8.2 without this issue. But those systems are in
> different roles.

This sounds like the... damn, what is it called... some kind of internal
"counter" or "ticks" thing within the ZFS code that was discovered to
only begin happening after a certain period of time (which correlated to
some number of days, possibly 107).  I'm sorry that I can't be more
specific, but it's been discussed heavily on the lists in the past, and
fixes for all of that were committed to RELENG_8.  I wish I could
remember the name of the function or macro or variable name it pertained
to, something like LTHAW or TLOCK or something like that.  I would say
"I don't know why I can't remember", but I do know why I can't remember:
because I gave up trying to track all of these problems.

Does someone else remember this issue?  CC'ing Martin who might remember
for certain.

-- 
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |



Re: zfs arc and amount of wired memory

2012-02-08 Thread Miroslav Lachman

Andriy Gapon wrote:

> on 08/02/2012 12:31 Eugene M. Zheganin said the following:
>> Hi.
>>
>> On 08.02.2012 02:17, Andriy Gapon wrote:
>>> [output snipped]
>>>
>>> Thank you.  I don't see anything suspicious/unusual there.
>>> Just case, do you have ZFS dedup enabled by a chance?
>>>
>>> I think that examination of vmstat -m and vmstat -z outputs may provide some
>>> clues as to what got all that memory wired.
>>
>> Nope, I don't have deduplication feature enabled.
>
> OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?
>
>> By the way, today, after eating another 100M of wired memory this server hanged
>> out with multiple non-stopping messages
>>
>> swap_pager: indefinite wait buffer
>>
>> Since it's swapping on zvol, it looks to me like it could be the mentioned in
>> another thread here ("Swap on zvol - recommendable?") resource starvation issue;
>> may be it happens faster when the ARC isn't limited.
>
> It could be very well possible that swap on zvol doesn't work well when the
> kernel itself is starved on memory.
>
>> So I want to ask - how to report it and what should I include in such pr ?
>
> I am leaving swap-on-zvol issue aside.  Your original problem doesn't seem to be
> ZFS-related.  I suspect that you might be running into some kernel memory leak.
>  If you manage to reproduce the high wired value again, then vmstat -m and
> vmstat -z may provide some useful information.
>
> In this vein, do you use any out-of-tree kernel modules?
> Also, can you try to monitor your system to see when wired count grows?


I am seeing something similar on one of our machines. This is an old 7.3
with ZFS v13, which is why I did not report it.

The machine is used as storage for backups made by rsync. Everything runs
fine for about 107 days. Then backups get slower and slower because of
some strange memory situation.

Mem: 15M Active, 17M Inact, 3620M Wired, 420K Cache, 48M Buf, 1166M Free

ARC Size:
 Current Size:             1769 MB (arcsize)
 Target Size (Adaptive):   512 MB (c)
 Min Size (Hard Limit):    512 MB (zfs_arc_min)
 Max Size (Hard Limit):    3584 MB (zfs_arc_max)

The target size goes down to the min size, and after a few more days
the system is so slow that I must reboot the machine. Then it runs
fine for about 107 days and it all repeats again.
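
The symptom above (target pinned at the minimum while the current size
stays well above it) can be expressed as a small check. The Python sketch
below only restates the arithmetic; the function name arc_over_target is
made up, and on a live system the three inputs are assumed to come from the
arcstats values reported as size, c and c_min.

```python
def arc_over_target(size_mb, target_mb, min_mb):
    """Report whether the ARC target is at its floor while the ARC is larger.

    Returns (symptom_present, overshoot_mb), where overshoot_mb is how far
    the current size is above the target.
    """
    pinned_at_min = target_mb <= min_mb
    overshoot_mb = size_mb - target_mb
    return pinned_at_min and overshoot_mb > 0, overshoot_mb


# Values from the arc_summary output above.
print(arc_over_target(1769, 512, 512))
```

With the numbers from this machine the ARC sits 1257 MB above a target that
cannot shrink any further.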


You can see more on the MRTG graphs
http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
There are links to other useful information at the top of the page
(arc_summary, top, dmesg, fs usage, loader.conf).

There you can see the nightly backups (higher CPU load starting at 01:13);
otherwise the machine is idle.

It corresponds with the ARC target size dropping over the last 5 days
http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html

And with the ARC metadata cache exceeding its limit over the last 5 days
http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html

I don't know what's going on, and I don't know if it is something known /
fixed in newer releases. We are running a few more ZFS systems on 8.2
without this issue, but those systems are in different roles.


Miroslav Lachman


Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Alexander Motin

On 09.02.2012 00:38, Jeremy Chadwick wrote:

On Thu, Feb 09, 2012 at 12:22:40AM +0200, Alexander Motin wrote:

On 08.02.2012 23:27, Jeremy Chadwick wrote:

On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:

I have a 4 port eSata PCIe card with 3 external port multipliers attached on an 
AMD64 box (8G of RAM), RELENG8 from Feb1st.

siis0@pci0:5:0:0:   class=0x010400 card=0x71241095 chip=0x31241095 rev=0x02 
hdr=0x00
 vendor = 'Silicon Image Inc (Was: CMD Technology Inc)'
 device = 'PCI-X to Serial ATA Controller (SiI 3124)'
 class  = mass storage
 subclass   = RAID
 bar   [10] = type Memory, range 64, base 0xb4408000, size 128, enabled
 bar   [18] = type Memory, range 64, base 0xb440, size 32768, enabled
 bar   [20] = type I/O Port, range 32, base 0x3000, size 16, enabled
 cap 01[64] = powerspec 2  supports D0 D1 D2 D3  current D0
 cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 12 split 
transactions
 cap 05[54] = MSI supports 1 message, 64 bit enabled with 1 message

siis0:   port 0x3000-0x300f mem 
0xb4408000-0xb440807f,0xb440-0xb4407fff irq 19 at device 0.0 on pci5
siis0: [ITHREAD]
siisch0:   at channel 0 on siis0
siisch0: [ITHREAD]
siisch1:   at channel 1 on siis0
siisch1: [ITHREAD]
siisch2:   at channel 2 on siis0
siisch2: [ITHREAD]
siisch3:   at channel 3 on siis0
siisch3: [ITHREAD]

# camcontrol devlist
 at scbus0 target 0 lun 0 (pass0,ada0)
 at scbus0 target 1 lun 0 (pass1,ada1)
 at scbus0 target 2 lun 0 (pass2,ada2)
 at scbus0 target 3 lun 0 (pass3,ada3)
  at scbus0 target 15 lun 0 (pass4,pmp1)
 at scbus1 target 0 lun 0 (pass5,ada4)
 at scbus1 target 1 lun 0 (pass6,ada5)
 at scbus1 target 2 lun 0 (pass7,ada6)
 at scbus1 target 3 lun 0 (pass8,ada7)
 at scbus1 target 4 lun 0 (pass9,ada8)
  at scbus1 target 15 lun 0 (pass10,pmp0)
  at scbus4 target 0 lun 0 (pass11,da0)
 at scbus4 target 0 lun 1 (pass12,da1)
 at scbus4 target 16 lun 0 (pass13)
  at scbus5 target 0 lun 0 (pass14,da2)
  at scbus6 target 0 lun 0 (pass15,ada9)
  at scbus7 target 0 lun 0 (pass16,ada10)
  at scbus8 target 0 lun 0 (pass17,ada11)
 at scbus11 target 0 lun 0 (pass18,ada12)


Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) 
along with the odd slot timeout error.


Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4700
Feb  7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
Feb  7 23:49:32 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0100
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068


This indicates the controller on channel 1 (siisch1) is "stalled"
waiting for underlying communication with the device attached to it.


Feb  7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:33:52 backup3 last message repeated 2 times
Feb  8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:50:31 backup3 last message repeated 2 times
Feb  8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:16:28 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:36:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 04:04:05 backup3 kernel: siisch1: Error while READ LOG EXT


This indicates the underlying device was handed a READ LOG EXT ATA
command (command 0x2f) and the device did not respond promptly
(resulting in the timeout messages you see).


There are hours between the timeouts and the READ LOG EXT errors. They are
not directly related, but may have the same cause.


smartctl doesn't show any issues on the drives other than one that has some historical 
errors from a 

Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Jeremy Chadwick
On Thu, Feb 09, 2012 at 12:22:40AM +0200, Alexander Motin wrote:
> On 08.02.2012 23:27, Jeremy Chadwick wrote:
> >On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:
> >>I have a 4 port eSata PCIe card with 3 external port multipliers attached 
> >>on an AMD64 box (8G of RAM), RELENG8 from Feb1st.
> >>
> >>siis0@pci0:5:0:0:   class=0x010400 card=0x71241095 chip=0x31241095 
> >>rev=0x02 hdr=0x00
> >> vendor = 'Silicon Image Inc (Was: CMD Technology Inc)'
> >> device = 'PCI-X to Serial ATA Controller (SiI 3124)'
> >> class  = mass storage
> >> subclass   = RAID
> >> bar   [10] = type Memory, range 64, base 0xb4408000, size 128, enabled
> >> bar   [18] = type Memory, range 64, base 0xb440, size 32768, 
> >> enabled
> >> bar   [20] = type I/O Port, range 32, base 0x3000, size 16, enabled
> >> cap 01[64] = powerspec 2  supports D0 D1 D2 D3  current D0
> >> cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 12 split 
> >> transactions
> >> cap 05[54] = MSI supports 1 message, 64 bit enabled with 1 message
> >>
> >>siis0:  port 0x3000-0x300f mem 
> >>0xb4408000-0xb440807f,0xb440-0xb4407fff irq 19 at device 0.0 on pci5
> >>siis0: [ITHREAD]
> >>siisch0:  at channel 0 on siis0
> >>siisch0: [ITHREAD]
> >>siisch1:  at channel 1 on siis0
> >>siisch1: [ITHREAD]
> >>siisch2:  at channel 2 on siis0
> >>siisch2: [ITHREAD]
> >>siisch3:  at channel 3 on siis0
> >>siisch3: [ITHREAD]
> >>
> >># camcontrol devlist
> >>at scbus0 target 0 lun 0 (pass0,ada0)
> >>at scbus0 target 1 lun 0 (pass1,ada1)
> >>at scbus0 target 2 lun 0 (pass2,ada2)
> >>at scbus0 target 3 lun 0 (pass3,ada3)
> >> at scbus0 target 15 lun 0 (pass4,pmp1)
> >>at scbus1 target 0 lun 0 (pass5,ada4)
> >>at scbus1 target 1 lun 0 (pass6,ada5)
> >>at scbus1 target 2 lun 0 (pass7,ada6)
> >>at scbus1 target 3 lun 0 (pass8,ada7)
> >>at scbus1 target 4 lun 0 (pass9,ada8)
> >> at scbus1 target 15 lun 0 (pass10,pmp0)
> >> at scbus4 target 0 lun 0 (pass11,da0)
> >>at scbus4 target 0 lun 1 (pass12,da1)
> >>at scbus4 target 16 lun 0 (pass13)
> >> at scbus5 target 0 lun 0 (pass14,da2)
> >> at scbus6 target 0 lun 0 (pass15,ada9)
> >> at scbus7 target 0 lun 0 (pass16,ada10)
> >> at scbus8 target 0 lun 0 (pass17,ada11)
> >>at scbus11 target 0 lun 0 (pass18,ada12)
> >>
> >>
> >>Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) 
> >>along with a the odd slot timeout error.
> >>
> >>
> >>Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4700
> >>Feb  7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
> >>Feb  7 23:49:32 backup3 kernel: siisch1: siis_timeout is 0704 ss 
> >>7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
> >>Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4300
> >>Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30
> >>Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 
> >>7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
> >>Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0300
> >>Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25
> >>Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 
> >>7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
> >>Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0100
> >>Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24
> >>Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 
> >>7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
> >
> >This indicates the controller on channel 1 (siisch1) is "stalled"
> >waiting for underlying communication with the device attached to it.
> >
> >>Feb  7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 01:33:52 backup3 last message repeated 2 times
> >>Feb  8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 01:50:31 backup3 last message repeated 2 times
> >>Feb  8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 03:16:28 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 03:36:20 backup3 kernel: siisch1: Error while READ LOG EXT
> >>Feb  8 04:04:05 backup3 kernel: siisch1: Error while READ LOG EXT
> >
> >This indicates the underlying device was handed a READ LOG EXT ATA
> >command (command 0x2f) and the 

Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Alexander Motin

On 08.02.2012 23:27, Jeremy Chadwick wrote:

On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:

I have a 4 port eSata PCIe card with 3 external port multipliers attached on an 
AMD64 box (8G of RAM), RELENG8 from Feb1st.

siis0@pci0:5:0:0:   class=0x010400 card=0x71241095 chip=0x31241095 rev=0x02 
hdr=0x00
 vendor = 'Silicon Image Inc (Was: CMD Technology Inc)'
 device = 'PCI-X to Serial ATA Controller (SiI 3124)'
 class  = mass storage
 subclass   = RAID
 bar   [10] = type Memory, range 64, base 0xb4408000, size 128, enabled
 bar   [18] = type Memory, range 64, base 0xb440, size 32768, enabled
 bar   [20] = type I/O Port, range 32, base 0x3000, size 16, enabled
 cap 01[64] = powerspec 2  supports D0 D1 D2 D3  current D0
 cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 12 split 
transactions
 cap 05[54] = MSI supports 1 message, 64 bit enabled with 1 message

siis0:  port 0x3000-0x300f mem 
0xb4408000-0xb440807f,0xb440-0xb4407fff irq 19 at device 0.0 on pci5
siis0: [ITHREAD]
siisch0:  at channel 0 on siis0
siisch0: [ITHREAD]
siisch1:  at channel 1 on siis0
siisch1: [ITHREAD]
siisch2:  at channel 2 on siis0
siisch2: [ITHREAD]
siisch3:  at channel 3 on siis0
siisch3: [ITHREAD]

# camcontrol devlist
at scbus0 target 0 lun 0 (pass0,ada0)
at scbus0 target 1 lun 0 (pass1,ada1)
at scbus0 target 2 lun 0 (pass2,ada2)
at scbus0 target 3 lun 0 (pass3,ada3)
 at scbus0 target 15 lun 0 (pass4,pmp1)
at scbus1 target 0 lun 0 (pass5,ada4)
at scbus1 target 1 lun 0 (pass6,ada5)
at scbus1 target 2 lun 0 (pass7,ada6)
at scbus1 target 3 lun 0 (pass8,ada7)
at scbus1 target 4 lun 0 (pass9,ada8)
 at scbus1 target 15 lun 0 (pass10,pmp0)
 at scbus4 target 0 lun 0 (pass11,da0)
at scbus4 target 0 lun 1 (pass12,da1)
at scbus4 target 16 lun 0 (pass13)
 at scbus5 target 0 lun 0 (pass14,da2)
 at scbus6 target 0 lun 0 (pass15,ada9)
 at scbus7 target 0 lun 0 (pass16,ada10)
 at scbus8 target 0 lun 0 (pass17,ada11)
at scbus11 target 0 lun 0 (pass18,ada12)


Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) 
along with the odd slot timeout error.


Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4700
Feb  7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
Feb  7 23:49:32 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0100
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068


This indicates the controller on channel 1 (siisch1) is "stalled"
waiting for underlying communication with the device attached to it.


Feb  7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:33:52 backup3 last message repeated 2 times
Feb  8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:50:31 backup3 last message repeated 2 times
Feb  8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:16:28 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:36:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 04:04:05 backup3 kernel: siisch1: Error while READ LOG EXT


This indicates the underlying device was handed a READ LOG EXT ATA
command (command 0x2f) and the device did not respond promptly
(resulting in the timeout messages you see).


There are hours between the timeouts and the READ LOG EXT errors. They are not 
directly related, but may have the same cause.



smartctl doesn't show any issues on the drives other than one that has some historical 
errors from a while ago.  What are these errors and do I need to worry about them? The 
"READ LOG EXT" ones are new.

{snipping SMART stats}


You

Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Jeremy Chadwick
On Wed, Feb 08, 2012 at 01:27:23PM -0800, Jeremy Chadwick wrote:
> On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:
> > Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) 
> > along with the odd slot timeout error.

BTW, something I forgot to cover in my reply: the slot number shown in
the output (e.g. "Timeout on slot NN") has nothing to do with "port
number", "connector", or anything like that.  It's an internal
controller feature; AHCI offers the same thing.  I did a rudimentary
analysis of this back in April 2011 by reviewing the code, and posted a
small (semi-technical) write-up on it:

http://lists.freebsd.org/pipermail/freebsd-fs/2011-April/011197.html

Taken from my post at that time, which is what I'm wanting to relay
here: "Timeout on slot N" != SATA port N.  Two unrelated things.
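
To illustrate the distinction, here is a minimal sketch. It assumes the
"waiting for slots" value in the log is a hexadecimal bitmask with one bit
per command slot; that is a reading of the messages, not something confirmed
in this thread. The mask 0x47000000 is a made-up illustrative value, and
pending_slots is a hypothetical helper name.

```python
def pending_slots(mask: int, total_slots: int = 32) -> list[int]:
    """Return the command-slot numbers whose bits are set in mask."""
    return [slot for slot in range(total_slots) if mask & (1 << slot)]


# Illustrative mask only: bits 24, 25, 26 and 30 are set.
print(pending_slots(0x47000000))  # [24, 25, 26, 30]
```

Under that assumption the slot numbers index outstanding commands on one
channel, so more than one of them may belong to the same drive.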

-- 
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |



Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Jeremy Chadwick
On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:
> I have a 4 port eSata PCIe card with 3 external port multipliers attached on 
> an AMD64 box (8G of RAM), RELENG8 from Feb1st.
> 
> siis0@pci0:5:0:0:   class=0x010400 card=0x71241095 chip=0x31241095 
> rev=0x02 hdr=0x00
> vendor = 'Silicon Image Inc (Was: CMD Technology Inc)'
> device = 'PCI-X to Serial ATA Controller (SiI 3124)'
> class  = mass storage
> subclass   = RAID
> bar   [10] = type Memory, range 64, base 0xb4408000, size 128, enabled
> bar   [18] = type Memory, range 64, base 0xb440, size 32768, enabled
> bar   [20] = type I/O Port, range 32, base 0x3000, size 16, enabled
> cap 01[64] = powerspec 2  supports D0 D1 D2 D3  current D0
> cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 12 split 
> transactions
> cap 05[54] = MSI supports 1 message, 64 bit enabled with 1 message
> 
> siis0:  port 0x3000-0x300f mem 
> 0xb4408000-0xb440807f,0xb440-0xb4407fff irq 19 at device 0.0 on pci5
> siis0: [ITHREAD]
> siisch0:  at channel 0 on siis0
> siisch0: [ITHREAD]
> siisch1:  at channel 1 on siis0
> siisch1: [ITHREAD]
> siisch2:  at channel 2 on siis0
> siisch2: [ITHREAD]
> siisch3:  at channel 3 on siis0
> siisch3: [ITHREAD]
> 
> # camcontrol devlist
>at scbus0 target 0 lun 0 (pass0,ada0)
>at scbus0 target 1 lun 0 (pass1,ada1)
>at scbus0 target 2 lun 0 (pass2,ada2)
>at scbus0 target 3 lun 0 (pass3,ada3)
> at scbus0 target 15 lun 0 (pass4,pmp1)
>at scbus1 target 0 lun 0 (pass5,ada4)
>at scbus1 target 1 lun 0 (pass6,ada5)
>at scbus1 target 2 lun 0 (pass7,ada6)
>at scbus1 target 3 lun 0 (pass8,ada7)
>at scbus1 target 4 lun 0 (pass9,ada8)
> at scbus1 target 15 lun 0 (pass10,pmp0)
> at scbus4 target 0 lun 0 (pass11,da0)
>at scbus4 target 0 lun 1 (pass12,da1)
>at scbus4 target 16 lun 0 (pass13)
> at scbus5 target 0 lun 0 (pass14,da2)
> at scbus6 target 0 lun 0 (pass15,ada9)
> at scbus7 target 0 lun 0 (pass16,ada10)
> at scbus8 target 0 lun 0 (pass17,ada11)
>at scbus11 target 0 lun 0 (pass18,ada12)
> 
> 
> Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) 
> along with a the odd slot timeout error.
> 
> 
> Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4700
> Feb  7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
> Feb  7 23:49:32 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
> rs 7f17e8b9 es  sts 801d2000 serr 0068
> Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4300
> Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30
> Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
> rs 7f17e8b9 es  sts 801d2000 serr 0068
> Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0300
> Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25
> Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
> rs 7f17e8b9 es  sts 801d2000 serr 0068
> Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0100
> Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24
> Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
> rs 7f17e8b9 es  sts 801d2000 serr 0068

This indicates the controller on channel 1 (siisch1) is "stalled"
waiting for underlying communication with the device attached to it.

> Feb  7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 01:33:52 backup3 last message repeated 2 times
> Feb  8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 01:50:31 backup3 last message repeated 2 times
> Feb  8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 03:16:28 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 03:36:20 backup3 kernel: siisch1: Error while READ LOG EXT
> Feb  8 04:04:05 backup3 kernel: siisch1: Error while READ LOG EXT

This indicates the underlying device was handed a READ LOG EXT ATA
command (command 0x2f) and the device did not respond promptly
(resulting in the timeout messages you see).

> smartctl doesn't show any issues on the drives other than one that has some 
> historical errors from a while ago.  What are these errors and do I need to 
> worry about them? The "READ LOG EXT" ones are new.
>
> {snipping SMART stats}

You're focused heavily on 

Re: zfs arc and amount of wired memory

2012-02-08 Thread Andriy Gapon
on 08/02/2012 22:50 Jeremy Chadwick said the following:
> Politely -- recommending this to a user is a good choice of action, but
> the problem is that no user, even an experienced user, is going to know
> what all of the "Types" (vmstat -m) or "ITEMs" (vmstat -z) correlate
> with on the system.

I see no problem with users sharing the output and asking for help interpreting
it.  I do not know of any easier way to analyze problems like this one.

-- 
Andriy Gapon


siisch1: Error while READ LOG EXT

2012-02-08 Thread Mike Tancsa
I have a 4 port eSata PCIe card with 3 external port multipliers attached on an 
AMD64 box (8G of RAM), RELENG8 from Feb1st.

siis0@pci0:5:0:0:   class=0x010400 card=0x71241095 chip=0x31241095 rev=0x02 
hdr=0x00
vendor = 'Silicon Image Inc (Was: CMD Technology Inc)'
device = 'PCI-X to Serial ATA Controller (SiI 3124)'
class  = mass storage
subclass   = RAID
bar   [10] = type Memory, range 64, base 0xb4408000, size 128, enabled
bar   [18] = type Memory, range 64, base 0xb440, size 32768, enabled
bar   [20] = type I/O Port, range 32, base 0x3000, size 16, enabled
cap 01[64] = powerspec 2  supports D0 D1 D2 D3  current D0
cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 12 split 
transactions
cap 05[54] = MSI supports 1 message, 64 bit enabled with 1 message

siis0:  port 0x3000-0x300f mem 
0xb4408000-0xb440807f,0xb440-0xb4407fff irq 19 at device 0.0 on pci5
siis0: [ITHREAD]
siisch0:  at channel 0 on siis0
siisch0: [ITHREAD]
siisch1:  at channel 1 on siis0
siisch1: [ITHREAD]
siisch2:  at channel 2 on siis0
siisch2: [ITHREAD]
siisch3:  at channel 3 on siis0
siisch3: [ITHREAD]

# camcontrol devlist
   at scbus0 target 0 lun 0 (pass0,ada0)
   at scbus0 target 1 lun 0 (pass1,ada1)
   at scbus0 target 2 lun 0 (pass2,ada2)
   at scbus0 target 3 lun 0 (pass3,ada3)
at scbus0 target 15 lun 0 (pass4,pmp1)
   at scbus1 target 0 lun 0 (pass5,ada4)
   at scbus1 target 1 lun 0 (pass6,ada5)
   at scbus1 target 2 lun 0 (pass7,ada6)
   at scbus1 target 3 lun 0 (pass8,ada7)
   at scbus1 target 4 lun 0 (pass9,ada8)
at scbus1 target 15 lun 0 (pass10,pmp0)
at scbus4 target 0 lun 0 (pass11,da0)
   at scbus4 target 0 lun 1 (pass12,da1)
   at scbus4 target 16 lun 0 (pass13)
at scbus5 target 0 lun 0 (pass14,da2)
at scbus6 target 0 lun 0 (pass15,ada9)
at scbus7 target 0 lun 0 (pass16,ada10)
at scbus8 target 0 lun 0 (pass17,ada11)
   at scbus11 target 0 lun 0 (pass18,ada12)


Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) 
along with the odd slot timeout error.


Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4700
Feb  7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
Feb  7 23:49:32 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0100
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 
rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:33:52 backup3 last message repeated 2 times
Feb  8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:50:31 backup3 last message repeated 2 times
Feb  8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:16:28 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:36:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 04:04:05 backup3 kernel: siisch1: Error while READ LOG EXT


smartctl doesn't show any issues on the drives other than one that has some 
historical errors from a while ago.  What are these errors and do I need to 
worry about them? The "READ LOG EXT" ones are new.
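
To see how often each message really occurs, the log can be tallied while
honouring syslog's "last message repeated N times" lines. This is a rough
sketch only: tally is a hypothetical helper, and the regular expressions
assume the line layout shown in the excerpt above.

```python
import re
from collections import Counter

# "Mon DD HH:MM:SS host rest-of-line", as in the excerpt above.
LINE = re.compile(r"^\w{3}\s+\d+\s+\d\d:\d\d:\d\d\s+\S+\s+(.*)$")
REPEAT = re.compile(r"^last message repeated (\d+) times?$")


def tally(lines):
    """Count messages, crediting 'repeated N times' to the previous one."""
    counts = Counter()
    previous = None
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue
        message = match.group(1)
        repeat = REPEAT.match(message)
        if repeat:
            if previous is not None:
                counts[previous] += int(repeat.group(1))
            continue
        message = message.removeprefix("kernel: ")
        previous = message
        counts[message] += 1
    return counts


sample = """\
Feb  7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
Feb  8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:33:52 backup3 last message repeated 2 times
Feb  8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
""".splitlines()

for message, count in tally(sample).most_common():
    print(count, message)
```

The four sample lines are copied from the log above; a full run would read
/var/log/messages instead.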


This is the only drive with anything in its logs, so I'm not sure whether this is 
what is causing the driver to complain.

 smartctl -a /dev/ada9
smartctl 5.41 2011-06-09 r3365 [FreeBSD 8.2-STABLE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11
Device Model:     ST31000333AS
Serial Number:    9TE14SRV
LU WWN Device Id: 5 000c50 010a39664
Firmware Version: SD35
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:In smartctl d

Re: zfs arc and amount of wired memory

2012-02-08 Thread Jeremy Chadwick
On Wed, Feb 08, 2012 at 10:29:36PM +0200, Andriy Gapon wrote:
> on 08/02/2012 12:31 Eugene M. Zheganin said the following:
> > Hi.
> > 
> > On 08.02.2012 02:17, Andriy Gapon wrote:
> >> [output snipped]
> >>
> >> Thank you.  I don't see anything suspicious/unusual there.
> >> Just in case, do you have ZFS dedup enabled by any chance?
> >>
> >> I think that examination of vmstat -m and vmstat -z outputs may provide
> >> some clues as to what got all that memory wired.
> >>
> > Nope, I don't have deduplication feature enabled.
> 
> OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?

Andriy,

Politely -- recommending this to a user is a good choice of action, but
the problem is that no user, even an experienced user, is going to know
what all of the "Types" (vmstat -m) or "ITEMs" (vmstat -z) correlate
with on the system.

For example, for vmstat -m, the Type name is "solaris".  For vmstat -z,
the ITEMs are named zio_* but I have a feeling there are more than just
those which pertain to ZFS.  I'm having to make *assumptions*.

The FreeBSD VM is highly complex and is not "easy to understand" even
remotely.  It becomes more complex when you consider that we use terms
like "wired", "active", "inactive", "cache", and "free" -- and none of
them, in simple English terms, actually represent the words chosen for
what they do.

Furthermore, the only definition I've been able to find over the years
for how any of these work, what they do/mean, etc. is here:

http://www.freebsd.org/doc/en/books/arch-handbook/vm.html

And this piece of documentation is only useful for people who understand
VMs (note: it was written by Matt Dillon, for example).  It is not
useful for end-users trying to track down what within the kernel is
actually eating up memory.  "vmstat -m" is about as good as it's going to
get, and like I said, with the Type names being borderline ambiguous
(depending on what you're looking for -- with VFS and so on it's spread
all over the place), this becomes a very tedious task, where the user or
admin has to continually ask developers on the mailing lists what it is
they're looking at.
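
As a partial aid, the output can at least be ranked so the largest consumers
stand out. The following is a minimal sketch, assuming the usual "vmstat -m"
column layout (Type, InUse, MemUse in K, HighUse, Requests, Size(s)). The
numbers in SAMPLE are invented for illustration and top_types is a
hypothetical helper name.

```python
import re

# Type names may contain spaces, so anchor on the numeric columns instead.
ROW = re.compile(r"^\s*(?P<type>.+?)\s+(?P<inuse>\d+)\s+(?P<memuse>\d+)K\s")


def top_types(vmstat_m_output: str, limit: int = 5) -> list[tuple[str, int]]:
    """Return (Type, MemUse in KiB) pairs, largest first."""
    rows = []
    for line in vmstat_m_output.splitlines():
        match = ROW.match(line)
        if match:  # the header line has no numeric columns and is skipped
            rows.append((match["type"], int(match["memuse"])))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:limit]


# Invented numbers, for illustration only.
SAMPLE = """\
         Type InUse MemUse HighUse Requests  Size(s)
       devbuf  2345  10234K       -     2501  16,32,64
      solaris 81234 912345K       - 5551234  16,32,64,128
CAM dev queue     5      1K       -        5  64
"""

for name, kib in top_types(SAMPLE):
    print(f"{kib:>10} KiB  {name}")
```

This does not explain what a Type means; it only narrows down which names
are worth asking about.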

-- 
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |



Re: zfs arc and amount of wired memory

2012-02-08 Thread Andriy Gapon
on 08/02/2012 12:31 Eugene M. Zheganin said the following:
> Hi.
> 
> On 08.02.2012 02:17, Andriy Gapon wrote:
>> [output snipped]
>>
>> Thank you.  I don't see anything suspicious/unusual there.
>> Just in case, do you have ZFS dedup enabled by any chance?
>>
>> I think that examination of vmstat -m and vmstat -z outputs may provide some
>> clues as to what got all that memory wired.
>>
> Nope, I don't have deduplication feature enabled.

OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?

> By the way, today, after eating another 100M of wired memory, this server
> hung with multiple non-stopping messages
> 
> swap_pager: indefinite wait buffer
> 
> Since it's swapping on zvol, it looks to me like it could be the resource
> starvation issue mentioned in another thread here ("Swap on zvol -
> recommendable?"); maybe it happens faster when the ARC isn't limited.

It could very well be that swap on zvol doesn't work well when the
kernel itself is starved of memory.

> So I want to ask: how do I report it, and what should I include in such a PR?

I am leaving swap-on-zvol issue aside.  Your original problem doesn't seem to be
ZFS-related.  I suspect that you might be running into some kernel memory leak.
 If you manage to reproduce the high wired value again, then vmstat -m and
vmstat -z may provide some useful information.

In this vein, do you use any out-of-tree kernel modules?
Also, can you try to monitor your system to see when wired count grows?
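
One way to watch the wired count is to sample it periodically and look at
the deltas. This is a minimal sketch, not a tested tool: it assumes the
vm.stats.vm.v_wire_count and hw.pagesize sysctls are available, and the
helper names are hypothetical.

```python
import subprocess


def read_sysctl(name: str) -> int:
    """Read one integer sysctl value by running `sysctl -n <name>`."""
    result = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())


def wired_bytes(wire_count_pages: int, page_size: int) -> int:
    """Convert a wired page count into bytes."""
    return wire_count_pages * page_size


def growth_mib(samples: list[int]) -> list[float]:
    """Change in MiB between consecutive wired-memory samples (in bytes)."""
    return [(later - earlier) / 2**20 for earlier, later in zip(samples, samples[1:])]
```

On the affected machine, something like
wired_bytes(read_sysctl("vm.stats.vm.v_wire_count"), read_sysctl("hw.pagesize"))
could be logged from cron every few minutes, together with saved vmstat -m
and vmstat -z output, so that a jump in wired memory can be matched against
what was running at the time.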

-- 
Andriy Gapon


Re: zfs arc and amount of wired memory

2012-02-08 Thread Freddie Cash
On Wed, Feb 8, 2012 at 10:40 AM, Freddie Cash  wrote:
> On Wed, Feb 8, 2012 at 10:25 AM, Eugene M. Zheganin wrote:
>> On 08.02.2012 18:15, Alexander Leidinger wrote:
>>> I can't remember having seen any mention of SWAP on ZFS being safe
>>> now. So if nobody can provide a reference to a place which says that
>>> the problems with SWAP on ZFS are fixed:
>>>  1. do not use SWAP on ZFS
>>>  2. see 1.
>>>  3. check if you see the same problem without SWAP on ZFS (btw. see 1.)
>>>
>> So, if swap has to be used, and it has to be backed by something
>> like gmirror so it won't go down with one of the disks, there's no need to
>> use zfs for the system.
>>
>> This makes zfs only useful in cases where you need to store a couple of
>> terabytes or more, while still having the OS on ufs. Occam's razor and so on.
>
> Or, you plug a USB stick into the back (or even inside the case as a
> lot of mobos have internal USB connectors now) and use that for swap.

That also works well for adding L2ARC (cache) to the ZFS pool.

-- 
Freddie Cash
fjwc...@gmail.com


Re: zfs arc and amount of wired memory

2012-02-08 Thread Freddie Cash
On Wed, Feb 8, 2012 at 10:25 AM, Eugene M. Zheganin  wrote:
> On 08.02.2012 18:15, Alexander Leidinger wrote:
>> I can't remember having seen any mention of SWAP on ZFS being safe
>> now. So if nobody can provide a reference to a place which says that
>> the problems with SWAP on ZFS are fixed:
>>  1. do not use SWAP on ZFS
>>  2. see 1.
>>  3. check if you see the same problem without SWAP on ZFS (btw. see 1.)
>>
> So, if swap has to be used, and it has to be backed by something
> like gmirror so it won't go down with one of the disks, there's no need to
> use zfs for the system.
>
> This makes zfs only useful in cases where you need to store a couple of
> terabytes or more, while still having the OS on ufs. Occam's razor and so on.

Or, you plug a USB stick into the back (or even inside the case as a
lot of mobos have internal USB connectors now) and use that for swap.

-- 
Freddie Cash
fjwc...@gmail.com


Re: zfs arc and amount of wired memory

2012-02-08 Thread Eugene M. Zheganin

Hi.

On 08.02.2012 18:15, Alexander Leidinger wrote:

I can't remember having seen any mention of SWAP on ZFS being safe
now. So if nobody can provide a reference to a place which says that
the problems with SWAP on ZFS are fixed:
  1. do not use SWAP on ZFS
  2. see 1.
  3. check if you see the same problem without SWAP on ZFS (btw. see 1.)

So, if swap has to be used, and it has to be backed by something like
gmirror so it won't go down with one of the disks, there's no need to
use ZFS for the system.

This makes ZFS useful only when you need to store a couple of terabytes
or more, while still keeping the OS on UFS. Occam's razor and so on.

Thanks for the explanation.
Eugene.


Re: FreeBSD 8.2-stable: devd fails to restart

2012-02-08 Thread Torfinn Ingolfsen
On Tue, 07 Feb 2012 12:16:15 -0700 (MST)
Warren Block  wrote:

> 
> It's devd, IMO.  Hey, come to think of it, I did enter a PR, the one 
> above.  If this is still a problem in 9 (which I can test in a bit), 
> posting to -current might get some needed attention on it.

PR updated.
-- 
Torfinn



Re: i18n not working during startup

2012-02-08 Thread Gala IT
Hi Victor,

Try setting tomcat7_java_opts="-Dfile.encoding=UTF-8" in /etc/rc.conf.

It works for us under 8.2.
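
As a minimal sketch of that suggestion, the /etc/rc.conf entry could look
like the fragment below. It assumes the installed Tomcat rc script reads a
tomcat7_java_opts variable, as described above; nothing else about the
setup is implied.

```sh
# /etc/rc.conf fragment (sketch). rc.conf is sourced as a shell script,
# so this is a plain variable assignment.
# Assumption: the installed Tomcat rc script honours tomcat7_java_opts.
tomcat7_java_opts="-Dfile.encoding=UTF-8"
```

Because this sets a JVM system property rather than LANG or LC_ALL, it only
affects the Java process and leaves the environment of other daemons alone.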

Kind regards,
David.

On 08/02/2012, at 13:11, Victor Balada Diaz wrote:

> Hello,
> 
> I tried freebsd-i18n but no one answered, so I will try my luck here.
> Sorry to the people who are subscribed to both lists.
> 
> - Forwarded message from Victor Balada Diaz  -
> 
> Date: Thu, 2 Feb 2012 19:17:21 +0100
> From: Victor Balada Diaz 
> To: freebsd-i...@freebsd.org
> Subject: i18n not working during startup
> User-Agent: Mutt/1.5.21 (2010-09-15)
> 
> Hello,
> 
> I've set up login classes as the handbook recommends, but it seems that
> daemons started by rc at system boot don't use them. What I'm actually
> trying to do is configure tomcat to use UTF-8 by default. I've configured
> its user class in /etc/login.conf by adding:
> 
> :setenv=LC_ALL=en_US.UTF-8:\
> :lang=en_US.UTF-8:\
> 
> I then rebuilt the login.conf db and rebooted. The daemons don't seem to
> have LANG or LC_ALL set in their environment. As a workaround I thought
> about adding export lines at the start of /etc/rc.conf, but that's an
> ugly hack.
> 
> Is there any other way of setting up lang settings for system startup daemons?
> 
> FreeBSD version: 7.4
> Arch: amd64
> 
> Thanks a lot.
> Regards
> -- 
> The most convincing proof that intelligent life exists on other
> planets is that they have not tried to contact us.
> 
> - End forwarded message -
> 
> -- 
> The most convincing proof that intelligent life exists on other
> planets is that they have not tried to contact us.



i18n not working during startup

2012-02-08 Thread Victor Balada Diaz
Hello,

I tried freebsd-i18n but no one answered, so I will try my luck here. Sorry
to the people who are subscribed to both lists.

- Forwarded message from Victor Balada Diaz  -

Date: Thu, 2 Feb 2012 19:17:21 +0100
From: Victor Balada Diaz 
To: freebsd-i...@freebsd.org
Subject: i18n not working during startup
User-Agent: Mutt/1.5.21 (2010-09-15)

Hello,

I've set up login classes as the handbook recommends, but it seems that
daemons started by rc at system boot don't use them. What I'm actually
trying to do is configure tomcat to use UTF-8 by default. I've configured
its user class in /etc/login.conf by adding:

:setenv=LC_ALL=en_US.UTF-8:\
:lang=en_US.UTF-8:\

I then rebuilt the login.conf db and rebooted. The daemons don't seem to
have LANG or LC_ALL set in their environment. As a workaround I thought
about adding export lines at the start of /etc/rc.conf, but that's an
ugly hack.

Is there any other way of setting up lang settings for system startup daemons?

FreeBSD version: 7.4
Arch: amd64

Thanks a lot.
Regards
-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.

- End forwarded message -

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: zfs arc and amount of wired memory

2012-02-08 Thread Alexander Leidinger
On Wed, 08 Feb 2012 16:31:44 +0600 "Eugene M. Zheganin"
 wrote:

> swap_pager: indefinite wait buffer
> 
> Since it's swapping on a zvol, it looks to me like it could be the
> resource starvation issue mentioned in another thread here ("Swap on
> zvol - recommendable?"); maybe it happens faster when the ARC isn't
> limited.
> 
> So I want to ask: how should I report it, and what should I include in
> such a PR?

I can't remember having seen any mention of SWAP on ZFS being safe
now. So unless somebody can provide a reference stating that the
problems with SWAP on ZFS are fixed:
 1. do not use SWAP on ZFS
 2. see 1.
 3. check if you see the same problem without SWAP on ZFS (btw. see 1.)
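
To check point 3 quickly, a sketch like the following could flag swap
devices that live on a zvol. It assumes swapinfo prints one device per
line with the device path in the first column; the function name
swap_on_zvol is made up for illustration.

```sh
#!/bin/sh
# swap_on_zvol: read swapinfo-style output on stdin and print every swap
# device whose path starts with /dev/zvol/.
# Exit status is 0 if at least one such device was found, 1 otherwise.
swap_on_zvol() {
    awk '$1 ~ /^\/dev\/zvol\// { print $1; found = 1 }
         END { exit !found }'
}

# Typical use on the affected machine (not run here):
#   swapinfo | swap_on_zvol && echo "swap is on a zvol"
```

Reading from stdin keeps the function independent of the host, so saved
swapinfo output from another machine can be checked the same way.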

Bye,
Alexander.


-- 
http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org   netchild @ FreeBSD.org  : PGP ID = 72077137


Re: MFC misc/124164 (Add SHA-256/512 hash algorithm to crypt(3)) to stable/8?

2012-02-08 Thread Mark Murray
Tim Bishop writes:
> Are there any committers willing to merge PR misc/124164 to stable/8
> before the 8.3 release freeze? It's already in HEAD and stable/9 so it's
> had some testing.
> 
> misc/124164 adds support for SHA256/512 to crypt(3). This is something
> we make use of on Linux and FreeBSD 9, and it'd be great to have the
> same support on FreeBSD 8.
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=124164
> 
> SVN Revs: 220496 220497
> 
> I've tried markm@ already and had no response.

Apologies - I'll get to it ASAP.

M
--
Mark R V Murray
Cert APS(Open) Dip Phys(Open) BSc Open(Open) BSc(Hons)(Open)
Pi: 132511160



Re: zfs arc and amount of wired memory

2012-02-08 Thread Eugene M. Zheganin

Hi.

On 08.02.2012 02:17, Andriy Gapon wrote:

[output snipped]

> Thank you.  I don't see anything suspicious/unusual there.
> Just in case, do you have ZFS dedup enabled by any chance?
>
> I think that examination of vmstat -m and vmstat -z outputs may provide some
> clues as to what got all that memory wired.


Nope, I don't have the deduplication feature enabled.

By the way, today, after eating another 100M of wired memory, this server
hung, printing the following message over and over:


swap_pager: indefinite wait buffer

Since it's swapping on a zvol, it looks to me like it could be the
resource starvation issue mentioned in another thread here ("Swap on
zvol - recommendable?"); maybe it happens faster when the ARC isn't
limited.

So I want to ask: how should I report it, and what should I include in
such a PR?
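
For anyone wanting to watch the wired count grow before such a hang, a
rough sketch follows. It assumes a FreeBSD host where the
vm.stats.vm.v_wire_count and hw.pagesize sysctls are available; the
function names and the log path in the usage note are made up for
illustration.

```sh
#!/bin/sh
# pages_to_mib PAGES PAGESIZE: convert a page count to whole MiB.
pages_to_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# wired_mib: current wired memory in MiB (FreeBSD only).
# Reads the wired page count and the page size via sysctl.
wired_mib() {
    pages_to_mib "$(sysctl -n vm.stats.vm.v_wire_count)" \
                 "$(sysctl -n hw.pagesize)"
}

# log_wired INTERVAL LOGFILE: append one timestamped sample to LOGFILE
# every INTERVAL seconds until interrupted.
log_wired() {
    while :; do
        printf '%s wired=%sM\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
            "$(wired_mib)" >> "$2"
        sleep "$1"
    done
}
```

Something like `log_wired 60 /var/tmp/wired.log &` would then give a
timeline to hold against vmstat -m and vmstat -z snapshots taken when the
value jumps.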

Thanks.
Eugene.


Re: kernel debugging and ULE

2012-02-08 Thread Julian Elischer

On 2/7/12 1:50 AM, Andriy Gapon wrote:

> on 06/02/2012 07:52 Julian Elischer said the following:
>> so if I'm sitting still in the debugger for too long, a hardclock
>> event happens that goes into ULE, which then hits the following KASSERT.
>>
>> KASSERT(pri >= PRI_MIN_BATCH && pri <= PRI_MAX_BATCH,
>>     ("sched_priority: invalid priority %d: nice %d, "
>>     "ticks %d ftick %d ltick %d tick pri %d",
>>     pri, td->td_proc->p_nice, td->td_sched->ts_ticks,
>>     td->td_sched->ts_ftick, td->td_sched->ts_ltick,
>>     SCHED_PRI_TICKS(td->td_sched)));
>>
>> The reason seems to be that I've been sitting still for too long and
>> things have become pear shaped.
>>
>> how is it that being in the debugger doesn't stop hardclock events?
>> is there something I can do to make them not happen..
>> It means I have to get my debugging done in less than about 60 seconds.
>>
>> suggestions welcome.
>
> Does this really happen when you just sit in the debugger?
> Or does it happen when you let the kernel run?  Like stepping through the code,
> etc


good point.. I was doing some single stepping..



MFC misc/124164 (Add SHA-256/512 hash algorithm to crypt(3)) to stable/8?

2012-02-08 Thread Tim Bishop
Are there any committers willing to merge PR misc/124164 to stable/8
before the 8.3 release freeze? It's already in HEAD and stable/9 so it's
had some testing.

misc/124164 adds support for SHA256/512 to crypt(3). This is something
we make use of on Linux and FreeBSD 9, and it'd be great to have the
same support on FreeBSD 8.
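
For context, crypt(3) implementations that use the modular hash format
pick the algorithm from the prefix of the hash string, with $5$ and $6$
conventionally selecting SHA-256 and SHA-512. A small sketch for telling
the schemes apart when comparing password files across systems (the
function name crypt_scheme and any sample strings are made up for
illustration, and the prefix table is not exhaustive):

```sh
#!/bin/sh
# crypt_scheme HASH: print the hash scheme implied by the prefix of a
# crypt(3)-style password hash string.
crypt_scheme() {
    case "$1" in
        '$1$'*)  echo md5 ;;
        '$2'*)   echo blowfish ;;
        '$5$'*)  echo sha256 ;;
        '$6$'*)  echo sha512 ;;
        *)       echo des-or-unknown ;;
    esac
}
```

A system whose crypt(3) lacks the SHA-256/512 support will not be able to
verify hashes that this function reports as sha256 or sha512.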

http://www.freebsd.org/cgi/query-pr.cgi?pr=124164

SVN Revs: 220496 220497

I've tried markm@ already and had no response.

Thanks,

Tim.

-- 
Tim Bishop
http://www.bishnet.net/tim/
PGP Key: 0x5AE7D984