Re: Stability of 11.1S

2018-03-22 Thread Dewayne Geraghty

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Stability of 11.1S

2018-03-21 Thread Freddie Cash
On Wed, Mar 21, 2018 at 7:59 AM, Warner Losh  wrote:

> On Wed, Mar 21, 2018 at 7:32 AM, George Mitchell 
> wrote:
>
> > On 03/21/18 04:51, Eitan Adler wrote:
> > > On 19 March 2018 at 22:59, Dewayne Geraghty
> > >  wrote:
> > >> [...]
> > >> PS Normally I would bisect, but we're converting 2 large PROLOG
> > applications
> > >> to erlang... (prayers welcome)
> > > [...]
> >
> > What next, converting a FORTH application to LISP?  (Sorry, couldn't
> > resist ...)
>
>
> Back in college we had a gentleman who was working on his FORTH LISP
> interpreter But I can't recall if it was a FORTH interpreter written in
> LISP or a LISP interpreter written in FORTH.
>

What happened to his first three LISP interpreters?  ;)  If at first you
don't succeed, try try try try again?

Sorry, that one was just too easy, could not resist. :D

I'll see myself out now.  :)



-- 
Freddie Cash
fjwc...@gmail.com


Re: Stability of 11.1S

2018-03-21 Thread Warner Losh
On Wed, Mar 21, 2018 at 7:32 AM, George Mitchell 
wrote:

> On 03/21/18 04:51, Eitan Adler wrote:
> > On 19 March 2018 at 22:59, Dewayne Geraghty
> >  wrote:
> >> [...]
> >> PS Normally I would bisect, but we're converting 2 large PROLOG
> applications
> >> to erlang... (prayers welcome)
> > [...]
>
> What next, converting a FORTH application to LISP?  (Sorry, couldn't
> resist ...)


Back in college we had a gentleman who was working on his FORTH LISP
interpreter.  But I can't recall if it was a FORTH interpreter written in
LISP or a LISP interpreter written in FORTH.

Warner


Re: Stability of 11.1S

2018-03-21 Thread George Mitchell
On 03/21/18 04:51, Eitan Adler wrote:
> On 19 March 2018 at 22:59, Dewayne Geraghty
>  wrote:
>> [...]
>> PS Normally I would bisect, but we're converting 2 large PROLOG applications
>> to erlang... (prayers welcome)
> [...]

What next, converting a FORTH application to LISP?  (Sorry, couldn't
resist ...)
-- George





Re: Stability of 11.1S

2018-03-21 Thread Eitan Adler
On 19 March 2018 at 22:59, Dewayne Geraghty
 wrote:
> Hi Eitan,
> Agreed.   Unfortunately all I have is that it abruptly shuts down.  Both
> under load (10,8,?) - during a full package rebuild (~1200 ports); and
> during periods of idleness between 1am-2am.  From our console.log there are
> approximately 6 MARK entries in the logs, so it can be idle for that period
> of time (2 hours) before halting, abruptly.

There has been some additional conversation but just wanted to pick
something to reply to:

I will take full ownership if something I've MFCed caused breakage.
That said I'm somewhat stuck unless I am able to reproduce the issue,
or at least guess as to the cause.


> PS Normally I would bisect, but we're converting 2 large PROLOG applications
> to erlang... (prayers welcome)

Both fun languages.



-- 
Eitan Adler


Re: Stability of 11.1S

2018-03-20 Thread Marek Zarychta
On Tue, Mar 20, 2018 at 11:10:47AM -0700, Jeremy Chadwick wrote:
> (Please keep me CC'd as I am not subscribed to -stable)
> 
> I haven't seen any issues, but that means very little.  Details:
> 
> Two boxes -- one bare metal, one VPS (QEMU):
> 
> $ uname -a
> FreeBSD XXX 11.1-STABLE FreeBSD 11.1-STABLE #0 r330529: Tue Mar  
> 6 11:36:04 PST 2018 
> root@XXX:/usr/obj/usr/src/sys/X7SBA_RELENG_11_amd64  amd64
> $ uptime
> 10:33a.m.  up 13 days, 18:10, 2 users, load averages: 0.15, 0.19, 0.16
> 
> $ uname -a
> FreeBSD  11.1-STABLE FreeBSD 11.1-STABLE #0 r330753: Sat Mar 
> 10 21:34:20 PST 2018 
> root@:/usr/obj/usr/src/sys/_RELENG_11_amd64  amd64
> $ uptime
> 10:33a.m.  up 9 days, 10:46, 1 user, load averages: 0.31, 0.35, 0.31
> 
> Systems were updated recently because I wanted to test Meltdown/Spectre
> mitigation (more on that below).  Prior to that, bare metal was running
> 9.x with 200+ day uptimes, VPS was running 10.x with 80-90 day uptimes
> (VPS providers' HV crashed, i.e. not FreeBSD issues).
> 
> Since load averages on FreeBSD 10.x onward cannot be trusted[1][2], I
> have to explain the general system specs and loads:
> 
> Bare metal box is an Intel Core 2 Quad Q9550, 8GB RAM, doing very little
> other than running Apache + lots of cron jobs for systems stuff + ZFS
> with several disks (but not OS disk; that's a dedicated SSD w/ UFS + SU
> (not SUJ).  The cron jobs tend to stress the network and disk I/O a bit;
> ZFS gets used every day, but only "heavily" during LAN file copies
> to/from it (Samba is involved), and during nightly backups with rsync.
> 
> VPS box is some form of QEMU-based Intel Haswell CPU, 1GB RAM, doing
> general things like Apache + postfix + SpamAssassin + some other
> daemons, and a lot of Perl.  Swap is used heavily on this machine.
> Disks are all vtblk, and I use multiple to get capacity for the needed
> space for /usr/src and /usr/obj.  Everything is UFS + SU (not SUJ).
> 
> Things off the top of my head that might be relevant to you:
> 
> 1. r329462 added Meltdown/Spectre mitigation[3][4].
> 
> Bare metal box has the below in /boot/loader.conf, since this is a
> machine that does not need either given its environment:
> 
> # Disable PTI (Meltdown mitigation) and IBRS (Spectre mitigation); these
> # are not relevant on this bare-metal system given its environment and
> # use case.  Details of these tunables is here:
> # https://lists.freebsd.org/pipermail/freebsd-stable/2018-March/088526.html
> #
> vm.pmap.pti="0"
> hw.ibrs_disable="1"
> 
> VPS box has no tunings of this sort, and ends up with the below, because
> the hosting provider has no done BIOS + QEMU updates to add IBRS
> support (they're very aware of it + have attempted it twice but
> apparently it didn't go well):
> 
> vm.pmap.pti: 1
> hw.ibrs_disable: 1
> hw.ibrs_active: 0
> 
> 2. If your CPU is an AMD Ryzen, there is a VERY long discussion on
> -stable about problems with Ryzen manifesting itself in a very
> uncomfortable way, leading to system lock-ups[5].  There are unofficial
> patches you can try.  I would recommend chiming in there and not here,
> if relevant to your systems.
> 
> And yes, the massive number of MFCs that eadler@ is doing make tracking
> down exact things more tedious than normal, especially when you have
> sweeping commits like this one[6][7] (which, AFAIK, was acting as a
> major blocker for several other MFCs and causing general merge
> problems).
> 
> However, I commend his efforts; it's a massive undertaking (I would say
> full-time job).  We stable users must accept that we are running
> stable/11 for a reason -- not only to get fixes faster, but to act a
> form of "guinea pig" that don't want the risks of HEAD/CURRENT.  The
> more people using stable/11 the better overall feedback devs can get on
> bugs/issues before making it into the next -RELEASE.  This is exactly
> why, for those of you who have known me over the years, I actually
> "track" or "follow" commits as they come across.  I do this by using the
> FreshBSD site[8] alongside manual review of svnlite update output.  I
> generally know what files/bits are relevant to my interests.
> 
> Hope this gives you some things to think about.  Good luck!
> 
> [1]: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=173541#c8
> [2]: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=173541#c22
> [3]: 
> https://lists.freebsd.org/pipermail/freebsd-stable/2018-February/088396.html
> [4]: https://lists.freebsd.org/pipermail/freebsd-stable/2018-March/088526.html
> [5]: 
> https://lists.freebsd.org/pipermail/freebsd-stable/2018-January/thread.html#88174
> [6]: http://www.freshbsd.org/commit/freebsd/r330897
> [7]: https://svnweb.freebsd.org/base?view=revision=330897
> [8]: http://www.freshbsd.org/?branch=RELENG_11=freebsd
> 

I follow STABLE, regularly upgrading a bunch of servers, routers, PCs,
a laptop and sometimes even a Raspberry Pi, always using builds made with
meta 

Re: Stability of 11.1S

2018-03-20 Thread Ian Lepore
On Tue, 2018-03-20 at 10:50 +, Pete French wrote:
> 
> On 20/03/2018 01:05, Dewayne Geraghty wrote:
> > 
> > We rebuild 11.1-Stable at least every two weeks.  Our build on the 7th
> > Feb is in use on our development boxes, however the rebuild on 22nd
> > resulted in frequent crashes and our reverting to FreeBSD 11.1-STABLE
> > r329008.  Is anyone actually running a Stable that was built after 22nd
> > Feb?  Could you please share the revision number?
> r330769 works fine for me. I usually upgrade on a Monday, though
> am holding off this week as am waiting for r330745 to land in STABLE,
> but it works fine for me always.
> 
> 
> > 
> > Because the churn in
> > https://lists.freebsd.org/pipermail/svn-src-stable-11/2018-March/ is
> > high we haven't been able to sight if a problem was identified and
> > fixed; so we're really looking for a functioning stable that we can
> > resume tracking.
> I use this to eyeball whats gone into STABLE, its a daily read for me as 
> I find keeping up with the mailing list tricky too.
> 
> 
> http://www.freshbsd.org/?branch=RELENG_11=freebsd;;
> 
> -pete.

I meant to get that done over the weekend but didn't actually get to it
until today.  I've MFC'd it to 11 as r331262, and I'm checking to see
whether it should go back to 10-stable as well.

-- Ian


Re: Stability of 11.1S

2018-03-20 Thread Jeremy Chadwick
(Please keep me CC'd as I am not subscribed to -stable)

I haven't seen any issues, but that means very little.  Details:

Two boxes -- one bare metal, one VPS (QEMU):

$ uname -a
FreeBSD XXX 11.1-STABLE FreeBSD 11.1-STABLE #0 r330529: Tue Mar  6 11:36:04 PST 2018 root@XXX:/usr/obj/usr/src/sys/X7SBA_RELENG_11_amd64  amd64
$ uptime
10:33a.m.  up 13 days, 18:10, 2 users, load averages: 0.15, 0.19, 0.16

$ uname -a
FreeBSD  11.1-STABLE FreeBSD 11.1-STABLE #0 r330753: Sat Mar 10 21:34:20 PST 2018 root@:/usr/obj/usr/src/sys/_RELENG_11_amd64  amd64
$ uptime
10:33a.m.  up 9 days, 10:46, 1 user, load averages: 0.31, 0.35, 0.31

Systems were updated recently because I wanted to test Meltdown/Spectre
mitigation (more on that below).  Prior to that, bare metal was running
9.x with 200+ day uptimes, VPS was running 10.x with 80-90 day uptimes
(VPS providers' HV crashed, i.e. not FreeBSD issues).

Since load averages on FreeBSD 10.x onward cannot be trusted[1][2], I
have to explain the general system specs and loads:

Bare metal box is an Intel Core 2 Quad Q9550, 8GB RAM, doing very little
other than running Apache + lots of cron jobs for systems stuff + ZFS
with several disks (but not OS disk; that's a dedicated SSD w/ UFS + SU
(not SUJ)).  The cron jobs tend to stress the network and disk I/O a bit;
ZFS gets used every day, but only "heavily" during LAN file copies
to/from it (Samba is involved), and during nightly backups with rsync.

VPS box is some form of QEMU-based Intel Haswell CPU, 1GB RAM, doing
general things like Apache + postfix + SpamAssassin + some other
daemons, and a lot of Perl.  Swap is used heavily on this machine.
Disks are all vtblk, and I use multiple to get capacity for the needed
space for /usr/src and /usr/obj.  Everything is UFS + SU (not SUJ).

Things off the top of my head that might be relevant to you:

1. r329462 added Meltdown/Spectre mitigation[3][4].

Bare metal box has the below in /boot/loader.conf, since this is a
machine that does not need either given its environment:

# Disable PTI (Meltdown mitigation) and IBRS (Spectre mitigation); these
# are not relevant on this bare-metal system given its environment and
# use case.  Details of these tunables are here:
# https://lists.freebsd.org/pipermail/freebsd-stable/2018-March/088526.html
#
vm.pmap.pti="0"
hw.ibrs_disable="1"

VPS box has no tunings of this sort, and ends up with the below, because
the hosting provider has not done BIOS + QEMU updates to add IBRS
support (they're very aware of it + have attempted it twice but
apparently it didn't go well):

vm.pmap.pti: 1
hw.ibrs_disable: 1
hw.ibrs_active: 0

2. If your CPU is an AMD Ryzen, there is a VERY long discussion on
-stable about problems with Ryzen manifesting itself in a very
uncomfortable way, leading to system lock-ups[5].  There are unofficial
patches you can try.  I would recommend chiming in there and not here,
if relevant to your systems.
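
The three sysctl values listed under item 1 can be condensed into one status line per tunable. This is only a minimal sketch: it assumes the `name: value` output format shown above, and the function name `summarize_mitigations` is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: reads "name: value" lines (as printed by sysctl)
# on stdin and prints a readable status for each tunable it recognises.
summarize_mitigations() {
    while IFS=': ' read -r name value; do
        case "$name" in
            vm.pmap.pti)
                if [ "$value" = "1" ]; then
                    echo "PTI (Meltdown mitigation): enabled"
                else
                    echo "PTI (Meltdown mitigation): disabled"
                fi ;;
            hw.ibrs_disable)
                if [ "$value" = "1" ]; then
                    echo "hw.ibrs_disable is set, IBRS will not be used"
                else
                    echo "hw.ibrs_disable is clear, IBRS may be used"
                fi ;;
            hw.ibrs_active)
                if [ "$value" = "1" ]; then
                    echo "IBRS: active"
                else
                    echo "IBRS: not active"
                fi ;;
        esac
    done
}
```

On a live system this would presumably be fed with `sysctl vm.pmap.pti hw.ibrs_disable hw.ibrs_active | summarize_mitigations`; unknown lines are silently ignored.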

And yes, the massive number of MFCs that eadler@ is doing make tracking
down exact things more tedious than normal, especially when you have
sweeping commits like this one[6][7] (which, AFAIK, was acting as a
major blocker for several other MFCs and causing general merge
problems).

However, I commend his efforts; it's a massive undertaking (I would say
full-time job).  We stable users must accept that we are running
stable/11 for a reason -- not only to get fixes faster, but to act as a
form of "guinea pig" for those who don't want the risks of HEAD/CURRENT.  The
more people using stable/11 the better overall feedback devs can get on
bugs/issues before making it into the next -RELEASE.  This is exactly
why, for those of you who have known me over the years, I actually
"track" or "follow" commits as they come across.  I do this by using the
FreshBSD site[8] alongside manual review of svnlite update output.  I
generally know what files/bits are relevant to my interests.
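
The manual review of update output described above could be narrowed to the paths one cares about with a small filter. This is a sketch under assumptions: it presumes each changed file appears as a line of status letters followed by a path (the usual shape of Subversion update output), and the name `relevant_changes` and the prefixes in the usage note are illustrative only.

```sh
#!/bin/sh
# Hypothetical helper: stdin is the output of an svn/svnlite update run,
# arguments are path prefixes of interest. Prints the changed paths that
# start with any of those prefixes.
relevant_changes() {
    awk -v prefixes="$*" '
        BEGIN { n = split(prefixes, want, " ") }
        # Keep only "STATUS path" lines; skip banners such as
        # "Updated to revision N." (paths containing spaces are ignored).
        NF == 2 && $1 ~ /^[ADUCGE]+$/ {
            for (i = 1; i <= n; i++)
                if (index($2, want[i]) == 1) { print $2; break }
        }'
}
```

For example, `svnlite update /usr/src | relevant_changes sys/dev/ sys/kern/` would list only driver and kernel files touched by the update, which keeps the daily read short.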

Hope this gives you some things to think about.  Good luck!

[1]: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=173541#c8
[2]: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=173541#c22
[3]: 
https://lists.freebsd.org/pipermail/freebsd-stable/2018-February/088396.html
[4]: https://lists.freebsd.org/pipermail/freebsd-stable/2018-March/088526.html
[5]: 
https://lists.freebsd.org/pipermail/freebsd-stable/2018-January/thread.html#88174
[6]: http://www.freshbsd.org/commit/freebsd/r330897
[7]: https://svnweb.freebsd.org/base?view=revision&revision=330897
[8]: http://www.freshbsd.org/?branch=RELENG_11=freebsd

-- 
| Jeremy Chadwick                         j...@koitsu.org |
| UNIX Systems Administrator       http://jdc.koitsu.org/ |
| Making life hard for others since 1977.    PGP 4BD6C0CB |


Re: Stability of 11.1S

2018-03-20 Thread Pete French



On 20/03/2018 01:05, Dewayne Geraghty wrote:

> We rebuild 11.1-Stable at least every two weeks.  Our build on the 7th
> Feb is in use on our development boxes, however the rebuild on 22nd
> resulted in frequent crashes and our reverting to FreeBSD 11.1-STABLE
> r329008.  Is anyone actually running a Stable that was built after 22nd
> Feb?  Could you please share the revision number?


r330769 works fine for me. I usually upgrade on a Monday, though
am holding off this week as am waiting for r330745 to land in STABLE,
but it works fine for me always.



> Because the churn in
> https://lists.freebsd.org/pipermail/svn-src-stable-11/2018-March/ is
> high we haven't been able to sight if a problem was identified and
> fixed; so we're really looking for a functioning stable that we can
> resume tracking.


I use this to eyeball what's gone into STABLE; it's a daily read for me, as
I find keeping up with the mailing list tricky too.



http://www.freshbsd.org/?branch=RELENG_11=freebsd

-pete.


Re: Stability of 11.1S

2018-03-20 Thread Dewayne Geraghty



Re: Stability of 11.1S

2018-03-19 Thread Eitan Adler
On 19 March 2018 at 18:05, Dewayne Geraghty
 wrote:
> We rebuild 11.1-Stable at least every two weeks.  Our build on the 7th
> Feb is in use on our development boxes, however the rebuild on 22nd
> resulted in frequent crashes and our reverting to FreeBSD 11.1-STABLE
> r329008.  Is anyone actually running a Stable that was built after 22nd
> Feb?  Could you please share the revision number?
>
> Because the churn in
> https://lists.freebsd.org/pipermail/svn-src-stable-11/2018-March/ is
> high we haven't been able to sight if a problem was identified and
> fixed; so we're really looking for a functioning stable that we can
> resume tracking.

Hi,

I can't help identify the problem, or tell whether it was fixed, without any
information. Can you at least let us know what kind of crashes you are
seeing? Kernel panics? SIGBUS? Something else?

It would be best if you could bisect to the revision causing you problems.

Note that despite the name, STABLE is a development branch and users
of the branch are expected to be able to provide some help tracking
down issues.
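
For anyone who has not bisected by revision number before, the bookkeeping can be sketched roughly as follows. The helper names are hypothetical, r329008 is the known-good revision from the original report, and the known-bad revision has to be supplied by whoever hit the crash.

```sh
#!/bin/sh
# Hypothetical helpers for a manual bisect over SVN revision numbers.

# Print the revision halfway between a known-good ($1) and known-bad ($2) one.
bisect_mid() {
    echo $(( $1 + ($2 - $1) / 2 ))
}

# Arguments: good bad tested verdict. Prints the narrowed "good bad" range
# after the tested revision has been judged "good" or "bad".
bisect_narrow() {
    case "$4" in
        good) echo "$3 $2" ;;
        bad)  echo "$1 $3" ;;
        *)    echo "verdict must be 'good' or 'bad'" >&2; return 1 ;;
    esac
}
```

Each round would be: pick `bisect_mid GOOD BAD`, check that revision out (for example with `svnlite update -r REV /usr/src`), rebuild, run the workload, then feed the verdict to `bisect_narrow`. The range halves every round, so a span of roughly 1,000 revisions needs about 10 builds.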

-- 
Eitan Adler


Re: Stability of 11.1S

2018-03-19 Thread David Wolfskill
On Tue, Mar 20, 2018 at 12:05:33PM +1100, Dewayne Geraghty wrote:
> We rebuild 11.1-Stable at least every two weeks.  Our build on the 7th
> Feb is in use on our development boxes, however the rebuild on 22nd
> resulted in frequent crashes and our reverting to FreeBSD 11.1-STABLE 
> r329008.  Is anyone actually running a Stable that was built after 22nd
> Feb?  Could you please share the revision number?

These are lightly loaded, but the two "production" boxes at home run
a stable/11 snapshot, built weekly.

Details on the process may be found at
; the page with
historical information (including "uname" output for each snapshot run)
is .  (Each of the
two machines runs from the same sources as listed for "albert".)

(My laptop & build machine run a daily snapshot of stable/11, as well as
building & smoke-testing a daily snapshot of head.  The above-cited
"weekly snapshot" is actually a "daily snapshot" that is sampled only
weekly -- on Sunday morning.)

> ... 

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
An investigator who doesn't make a perp nervous isn't doing his job.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Stability of 11.1S

2018-03-19 Thread Dewayne Geraghty
We rebuild 11.1-Stable at least every two weeks.  Our build on the 7th
Feb is in use on our development boxes, however the rebuild on 22nd
resulted in frequent crashes and our reverting to FreeBSD 11.1-STABLE 
r329008.  Is anyone actually running a Stable that was built after 22nd
Feb?  Could you please share the revision number?
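
For those replying with a revision number, one possible way to pull it out of the kernel version string is sketched below. It assumes the string contains the revision in the form ` rNNNNNN:`, as stable/11 kernels built from an SVN checkout report it; the function name `kernel_rev` is made up.

```sh
#!/bin/sh
# Hypothetical helper: reads `uname -a` (or `uname -v`) output on stdin and
# prints the SVN revision, e.g. r329008. Prints nothing if none is found.
kernel_rev() {
    sed -n 's/.* \(r[0-9][0-9]*\):.*/\1/p'
}
```

Typical use would be `uname -v | kernel_rev`.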

Because the churn in
https://lists.freebsd.org/pipermail/svn-src-stable-11/2018-March/ is
high we haven't been able to sight if a problem was identified and
fixed; so we're really looking for a functioning stable that we can
resume tracking.

PS There was no information in the logs, and we have no instrumentation
in the kernel to help. Sorry.

-- 

Influence national support against IP address spoofing (pretending to be 
someone else), refer: http://www.bcp38.info/index.php/Main_Page



Re: FreeBSD 9.2-RELEASE stability?

2013-10-01 Thread Thomas Mueller
 I have just upgraded two virtual machines running on ESXi. They are i386
 with 256 MB of RAM and one CPU, with just a few ports installed (sudo and
 screen and dependencies). They don't do much work (low-traffic authoritative
 nameservers for a dozen domains). I upgraded by freebsd-update. I don't
 see any problems so far. Also my laptop is 9-STABLE amd64 (currently at
 r255867) and I do not have any more problems than usual (the unfortunate
 AR9285 wifi adapter).

 Marko Cupać

I've been looking for Atheros support in FreeBSD, especially HEAD.

In my case it's an MSI motherboard, Z77 MPOWER, with onboard Ethernet chip 
Realtek 8111E and wifi chip Atheros AR9271.

It looks like AR9271 and AR9285 are supported in NetBSD-current, you can view 
NetBSD man pages online.

I intend to try, not to abandon FreeBSD: update to 9.2-RELEASE on older MSI 
motherboard and build FreeBSD-current for the Z77 MPOWER.



Tom


Re: FreeBSD 9.2-RELEASE stability?

2013-10-01 Thread Ronald Klop

On Mon, 30 Sep 2013 21:01:26 +0200, Brett Glass br...@lariat.net wrote:

> How stable are folks finding FreeBSD 9.2-RELEASE to be? The improvements
> are welcome, but there have been a few troubling messages about kernel
> panics and VM issues on the various mailing lists. It's never clear
> until the release drops whether these are actual problems with the
> software or hardware defects in individual systems, so I am eager to
> hear how the new release is working for everyone.
>
>
> --Brett Glass


I agree that on the mailing list it looks like this happens:

1. X.Y-RELEASE
2. bugfixes on X.Y-STABLE
3. half way between 2 releases X.Y-STABLE looks pretty good
4. announcement code freeze X.(Y+1)-RELEASE is coming
5. MFC all kinds of new features from -HEAD to -STABLE
6. A lot of mails about bugs and also fixes
7. X.(Y+1)-RELEASE
8. bugfixes on X.(Y+1)-STABLE
9. half way between 2 releases X.(Y+1)-STABLE is pretty good

But in the end a mailing list is a collection of problem reports and not a
collection of success stories. For a lot of people it runs very well and
you never hear from them.

So I guess it all runs pretty well unless your system does not.

Cheers,
Ronald.


FreeBSD 9.2-RELEASE stability?

2013-09-30 Thread Brett Glass
How stable are folks finding FreeBSD 9.2-RELEASE to be? The 
improvements are welcome, but there have been a few troubling 
messages about kernel panics and VM issues on the various mailing 
lists. It's never clear until the release drops whether these are 
actual problems with the software or hardware defects in individual 
systems, so I am eager to hear how the new release is working for everyone.


--Brett Glass



Re: FreeBSD 9.2-RELEASE stability?

2013-09-30 Thread Marko Cupać
On Mon, 30 Sep 2013 13:01:26 -0600
Brett Glass br...@lariat.net wrote:

 How stable are folks finding FreeBSD 9.2-RELEASE to be? The 
 improvements are welcome, but there have been a few troubling 
 messages about kernel panics and VM issues on the various mailing 
 lists. It's never clear until the release drops whether these are 
 actual problems with the software or hardware defects in individual 
 systems, so I am eager to hear how the new release is working for everyone.

I have just upgraded two virtual machines running on ESXi. They are i386
with 256 MB of RAM and one CPU, with just a few ports installed (sudo and
screen and dependencies). They don't do much work (low-traffic authoritative
nameservers for a dozen domains). I upgraded by freebsd-update. I don't
see any problems so far. Also my laptop is 9-STABLE amd64 (currently at
r255867) and I do not have any more problems than usual (the unfortunate
AR9285 wifi adapter).
-- 
Marko Cupać

Re: FreeBSD 9.2-RELEASE stability?

2013-09-30 Thread Patrick Lamaiziere
Le Mon, 30 Sep 2013 13:01:26 -0600,
Brett Glass br...@lariat.net a écrit :

Hello,

 How stable are folks finding FreeBSD 9.2-RELEASE to be? The 
 improvements are welcome, but there have been a few troubling 
 messages about kernel panics and VM issues on the various mailing 
 lists. It's never clear until the release drops whether these are 
 actual problems with the software or hardware defects in individual 
 systems, so I am eager to hear how the new release is working for
 everyone.

I've seen two problems if you use poudriere (on ZFS only?) which
occur under some loads (i.e. a desktop running gvfsd). One fix is in 9-STABLE
and the other one should be MFCed soon.

Maybe there will be an erratum for 9.2-RELEASE for
this? I think that would be nice because 9.2 is as stable as
Windows 3.11 with my load :-)

Regards.


Re: FreeBSD 9.1 stability/robustness?

2012-11-06 Thread Rick Miller
Someone asked about getting the DL360 G8 to boot to their Intel-branded
PCI NIC, but I cannot find that email :(

At any rate, we removed the Broadcom daughterboard from the system and
ensured that the Intel PCI NIC (i350) was bootable in the BIOS.

On Sat, Nov 3, 2012 at 9:56 PM, Rick Miller vmil...@hostileadmin.com wrote:

 I have a blog post at
 http://blog.hostileadmin.com/2012/06/14/freebsd-on-hp-proliant-dl360p-g8-servers/
 which touches on this.  I heard as recently as today that the fixes
 for the BCM5719 and 5720 were recently committed to -CURRENT.  It's
 too late for them to be rolled into 9.1.  Not sure if they'll be
 committed to stable/8 or not, but if so they could make it into
 8.4-R.

-- 
Take care
Rick Miller


Re: FreeBSD 9.1 stability/robustness?

2012-11-04 Thread Rainer Duffner
Am Sat, 3 Nov 2012 21:56:45 -0400
schrieb Rick Miller vmil...@hostileadmin.com:

 On Fri, Nov 2, 2012 at 4:10 AM, Rainer Duffner
 rai...@ultra-secure.de wrote:
  Am Thu, 1 Nov 2012 20:14:51 -0600 (MDT)
  schrieb Brett Glass br...@lariat.net:
 
  I need to build up a few servers and routers, and am wondering how
  FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
  robust than 9.0-RELEASE? Are there issues that will have to wait
  until 9.2-RELEASE to be fixed? Opinions welcome.
 
 
  If I'm not mistaken, the bge-stuff that makes the default NICs in
  HP G8 servers (360+380) actually run will not make it back into 9.1.
  Intel cards work much better anyway...
 
 I have a blog post at
 http://blog.hostileadmin.com/2012/06/14/freebsd-on-hp-proliant-dl360p-g8-servers/
 which touches on this. 

It comes up invariably once you google for FreeBSD DL 380 G8...

 I heard as recently as today that the fixes
 for the BCM5719 and 5720 were recently committed to -CURRENT.  It's
 too late for them to be rolled into 9.1.  Not sure if they'll be
 committed to stable/8 or not, but if so they could make it into
 8.4-R.

Oh - will there be an 8.4 release? That would be interesting.



Re: FreeBSD 9.1 stability/robustness?

2012-11-04 Thread Brett Glass

At 10:28 AM 11/4/2012, Rainer Duffner wrote:


> Oh - will there be an 8.4 release? That would be interesting.


I'd like to see a trend toward more point versions of FreeBSD. 
Particularly in 9.x, because it incorporates most of the items that 
have been on people's wish lists. 4.11 was one of the most robust 
and stable releases ever, and I used it for many years.


--Brett Glass




Re: FreeBSD 9.1 stability/robustness?

2012-11-04 Thread Kurt Jaeger
Hi!

 Oh - will there be an 8.4 release? That would be interesting.
 
 I'd like to see a trend toward more point versions of FreeBSD. 
 Particularly in 9.x, because it incorporates most of the items that 
 have been on people's wish lists. 4.11 was one of the most robust 
 and stable releases ever, and I used it for many years.

I still use 4.11 on two servers 8-}

-- 
p...@opsec.eu+49 171 3101372 8 years to go !


Re: FreeBSD 9.1 stability/robustness?

2012-11-04 Thread Rick Miller
On Sun, Nov 4, 2012 at 12:28 PM, Rainer Duffner rai...@ultra-secure.de wrote:

 Oh - will there be an 8.4 release? That would be interesting.

History shows that every release since 4.x has gone to at least .4.
I'd be willing to bet we will see an 8.4.  The branch is still being
developed.

http://en.wikipedia.org/wiki/FreeBSD#Timeline


-- 
Take care
Rick Miller


Re: FreeBSD 9.1 stability/robustness?

2012-11-03 Thread Mark Saad


On Nov 2, 2012, at 3:15 PM, Rainer Duffner rai...@ultra-secure.de wrote:

 Am Fri, 2 Nov 2012 13:34:20 -0400
 schrieb Mark Saad nones...@longcount.org:
 
 
 
 
 On Nov 2, 2012, at 4:10 AM, Rainer Duffner rai...@ultra-secure.de
 wrote:
 
 Am Thu, 1 Nov 2012 20:14:51 -0600 (MDT)
 schrieb Brett Glass br...@lariat.net:
 
 I need to build up a few servers and routers, and am wondering how
 FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
 robust than 9.0-RELEASE? Are there issues that will have to wait
 until 9.2-RELEASE to be fixed? Opinions welcome.
 
 
 If I'm not mistaken, the bge-stuff that makes the default NICs in
 HP G8 servers (360+380) actually run will not make it back into 9.1.
 Intel cards work much better anyway...
 
 Did you swap out the bge NIC daughter card in the G8 servers for an
 Intel one, or do you mean that in general the Intel NIC support is
 better?
 
 Both, actually.
 At least, Intel has drivers for FreeBSD on their website and IIRC, it's
 a Tier 1 OS for them.
 I don't want to dis the efforts of the people working on the bXe stuff,
 but from what I have read, they have much less support from the vendor.
 We have used HP servers even back when they were still Compaq-servers
 (and came with Intel NICs...) and this is really the first time we had
 to install Intel NICs with them (with FreeBSD - there was an earlier
 issue with Solaris, but that does not count...).
 
 Are there Intel daughter cards for this server?
 I thought all the daughter cards came with some sort of Broadcom
 chipset.

HP did a presentation at work 2 weeks ago about the G8. HP said you can swap
out a daughter card in the 360/380/580 for NIC options like a Broadcom 4-port
gigabit NIC, Mellanox InfiniBand, Intel PRO/1000 4-port NIC, QLogic 8Gb FC-AL
and others. They said it's an FRU, but I have not seen the parts yet.

---
Mark saad | mark.s...@longcount.org



Re: FreeBSD 9.1 stability/robustness?

2012-11-03 Thread Rainer Duffner
On Sat, 3 Nov 2012 11:06:26 -0400
Mark Saad nones...@longcount.org wrote:

> HP gave a presentation at work two weeks ago about the G8. They said
> you can swap out the daughter card in the 360/380/580 for NIC options
> such as a Broadcom 4-port gigabit NIC, Mellanox InfiniBand, an Intel
> PRO/1000 4-port NIC, QLogic 8Gb FC-AL, and others.

I've heard that, too (was on holiday when the sales-guy was here)

> They said it's an FRU, but I have not seen the parts yet.


The quickspecs make no mention of it:

http://h18000.www1.hp.com/products/quickspecs/14212_na/14212_na.html


Only that 331FLR adapter, which contains that beloved BCM-chip.

http://h18000.www1.hp.com/products/quickspecs/14214_div/14214_div.HTML

Or one of the 10G adapters.
But 10G is probably worse - and we don't have any 10G switch-ports
anyway

With the infiniband-stuff, they are probably waiting for the first
customer to order a couple of thousand so they can do a profitable
one-off production run...


Re: FreeBSD 9.1 stability/robustness?

2012-11-03 Thread Rick Miller
On Fri, Nov 2, 2012 at 4:10 AM, Rainer Duffner rai...@ultra-secure.de wrote:
> On Thu, 1 Nov 2012 20:14:51 -0600 (MDT)
> Brett Glass br...@lariat.net wrote:
>
>> I need to build up a few servers and routers, and am wondering how
>> FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
>> robust than 9.0-RELEASE? Are there issues that will have to wait
>> until 9.2-RELEASE to be fixed? Opinions welcome.
>
> If I'm not mistaken, the bge-stuff that makes the default NICs in HP
> G8 servers (360+380) actually run will not make it back into 9.1.
> Intel cards work much better anyway...

I have a blog post at
http://blog.hostileadmin.com/2012/06/14/freebsd-on-hp-proliant-dl360p-g8-servers/
which touches on this.  I heard as recently as today that the fixes
for the BCM5719 and 5720 were recently committed to -CURRENT.  It's
too late for them to be rolled into 9.1.  Not sure if they'll be
committed to stable/8 or not, but if so they could make it into
8.4-R.

-- 
Take care
Rick Miller


Re: FreeBSD 9.1 stability/robustness?

2012-11-02 Thread Rainer Duffner
On Thu, 1 Nov 2012 20:14:51 -0600 (MDT)
Brett Glass br...@lariat.net wrote:

> I need to build up a few servers and routers, and am wondering how
> FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
> robust than 9.0-RELEASE? Are there issues that will have to wait
> until 9.2-RELEASE to be fixed? Opinions welcome.

If I'm not mistaken, the bge-stuff that makes the default NICs in HP
G8 servers (360+380) actually run will not make it back into 9.1.
Intel cards work much better anyway...



Re: FreeBSD 9.1 stability/robustness?

2012-11-02 Thread Thomas Mueller
On 1 November 2012, at 19:14, Brett Glass wrote:

 I need to build up a few servers and routers, and am wondering how
 FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
 robust than 9.0-RELEASE?

Doug Hardie responded:

 It appears to be for me.  I had problems with 9.0 not reading CDs and 
 rebooting with no error messages frequently.  I have upgraded to 9.1-RC2 and 
 it now
 reads CDs just fine, and has not rebooted.  However, the uptimes with 9.0 
 ranged from about 2 hours to 30 days.  I have only had 9.1-RC2 running for a
 couple weeks so have not declared victory yet.  It has been running for more 
 than most of the uptimes already.

I too had problems with 9.0 spontaneously rebooting after a day or two uptime.  
One was a freeze after a cvs update of NetBSD pkgsrc.  The second time was a 
spontaneous reboot during a time of idleness; I was in the same room and heard 
the computer sounds.

No more such problem after I updated, building from source, to RELENG_9 
(STABLE).

I haven't updated yet to 9.1 prerelease, bogged down with ports-upgrading snags 
and cross-compiling NetBSD.

Tom


Re: FreeBSD 9.1 stability/robustness?

2012-11-02 Thread George Mitchell

On 11/01/12 22:14, Brett Glass wrote:

> I need to build up a few servers and routers, and am wondering how
> FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
> robust than 9.0-RELEASE? Are there issues that will have to wait
> until 9.2-RELEASE to be fixed? Opinions welcome.
>
> --Brett Glass



Personally I don't have a warm, fuzzy feeling about 9.x yet.  I'm
sticking with 8.3 with 4BSD scheduler for now.   -- George Mitchell


Re: 9.1 stability/robustness?

2012-11-02 Thread pulley
 I need to build up a few servers and routers, and am wondering how
 FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
 robust than 9.0-RELEASE? Are there issues that will have to wait
 until 9.2-RELEASE to be fixed? Opinions welcome.

 --Brett Glass

I've got 9.1-RC2 running on 2 amd64 boxes both using ZFS root builds and 2
i386 platforms one desktop and one netbook.

The amd64 boxes are one file server (NFS/SMB) and one KDE4 photography
work-flow desktop/proofing station. Moving large files and streaming
video over lagg0 (LACP) gig-E links has given me no worries so far
(knock knock). Both are many-core, large-RAM systems.

The 2 i386's are just screw-around desktops (web email etc..) but no
problems there either.

As for routers I always use OpenBSD there just out of old habit more than
any technical reason and 5.2 is working great there...for the past 22
hours at least :)

looking forward to RC3 or release

-- 
Eric S Pulley





Re: FreeBSD 9.1 stability/robustness?

2012-11-02 Thread pete wright
On Thu, Nov 1, 2012 at 7:14 PM, Brett Glass br...@lariat.net wrote:
 I need to build up a few servers and routers, and am wondering how
 FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
 robust than 9.0-RELEASE? Are there issues that will have to wait
 until 9.2-RELEASE to be fixed? Opinions welcome.


Just another data point: running 9.1-RC1 as well as 9.1-RC2 since they
have become available on my primary workstation/build server (for
pkgng pkg's), in addition to my mail/web/shell servers.  I have
managed all my updates via freebsd-update, so no custom bits compiled
for the kernel or userland.  I have had no lockups, and performance is
great on my workstation/build server.  On all systems I've been using
a combination of ufs and zfs w/o issues as well.

Hope this helps.
-pete



-- 
pete wright
www.nycbug.org
@nomadlogicLA


Re: FreeBSD 9.1 stability/robustness?

2012-11-02 Thread Mark Saad



On Nov 2, 2012, at 4:10 AM, Rainer Duffner rai...@ultra-secure.de wrote:

> On Thu, 1 Nov 2012 20:14:51 -0600 (MDT)
> Brett Glass br...@lariat.net wrote:
>
>> I need to build up a few servers and routers, and am wondering how
>> FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
>> robust than 9.0-RELEASE? Are there issues that will have to wait
>> until 9.2-RELEASE to be fixed? Opinions welcome.
>
> If I'm not mistaken, the bge-stuff that makes the default NICs in HP
> G8 servers (360+380) actually run will not make it back into 9.1.
> Intel cards work much better anyway...

Did you swap out the bge NIC daughter card in the G8 servers for an Intel
one, or do you mean that Intel NIC support is better in general?




Re: FreeBSD 9.1 stability/robustness?

2012-11-02 Thread Rainer Duffner
On Fri, 2 Nov 2012 13:34:20 -0400
Mark Saad nones...@longcount.org wrote:

> On Nov 2, 2012, at 4:10 AM, Rainer Duffner rai...@ultra-secure.de
> wrote:
>
>> On Thu, 1 Nov 2012 20:14:51 -0600 (MDT)
>> Brett Glass br...@lariat.net wrote:
>>
>>> I need to build up a few servers and routers, and am wondering how
>>> FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
>>> robust than 9.0-RELEASE? Are there issues that will have to wait
>>> until 9.2-RELEASE to be fixed? Opinions welcome.
>>
>> If I'm not mistaken, the bge-stuff that makes the default NICs in
>> HP G8 servers (360+380) actually run will not make it back into 9.1.
>> Intel cards work much better anyway...
>
> Did you swap out the bge NIC daughter card in the G8 servers for an
> Intel one, or do you mean that Intel NIC support is better in
> general?

Both, actually.
At least, Intel has drivers for FreeBSD on their website and IIRC, it's
a Tier 1 OS for them.
I don't want to dis the efforts of the people working on the bXe stuff,
but from what I have read, they have much less support from the vendor.
We have used HP servers even back when they were still Compaq-servers
(and came with Intel NICs...) and this is really the first time we had
to install Intel NICs with them (with FreeBSD - there was an earlier
issue with Solaris, but that does not count...).

Are there Intel daughter cards for this server?
I thought all the daughter cards came with some sort of Broadcom
chipset.


Re: FreeBSD 9.1 stability/robustness?

2012-11-02 Thread Shane Ambler

On 02/11/2012 15:57, Doug Hardie wrote:

> On 1 November 2012, at 19:14, Brett Glass wrote:
>
>> I need to build up a few servers and routers, and am wondering how
>> FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
>> robust than 9.0-RELEASE?
>
> It appears to be for me.  I had problems with 9.0 not reading CDs and
> rebooting with no error messages frequently.  I have upgraded to
> 9.1-RC2 and it now reads CDs just fine, and has not rebooted.
> However, the uptimes with 9.0 ranged from about 2 hours to 30 days.
> I have only had 9.1-RC2 running for a couple weeks so have not
> declared victory yet.  It has been running for more than most of the
> uptimes already.



Personally I have had little issue with 9.0. I started with installing 
PC-BSD-9.0RC3 then moved to FreeBSD 9.0-RELEASE


Shortly after I installed a world built with clang which found an issue 
with libthr that is fixed in 9.1


Until yesterday my only restarts have been power failure or updating 
kernel and/or kmods - I seem to have trouble manually unloading the 
nvidia kmod so end up restarting. I am fairly certain the restart I had 
yesterday is related to cuse4bsd-kmod which I have disabled for now to 
try and prove that. While I can load and use the current version the 
previous one is the only one I have been able to have activated during 
startup.



9.1 stability/robustness?

2012-11-01 Thread Brett Glass
I need to build up a few servers and routers, and am wondering how
FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
robust than 9.0-RELEASE? Are there issues that will have to wait
until 9.2-RELEASE to be fixed? Opinions welcome.

--Brett Glass


FreeBSD 9.1 stability/robustness?

2012-11-01 Thread Brett Glass
I need to build up a few servers and routers, and am wondering how
FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
robust than 9.0-RELEASE? Are there issues that will have to wait
until 9.2-RELEASE to be fixed? Opinions welcome.

--Brett Glass


Re: FreeBSD 9.1 stability/robustness?

2012-11-01 Thread Doug Hardie

On 1 November 2012, at 19:14, Brett Glass wrote:

 I need to build up a few servers and routers, and am wondering how
 FreeBSD 9.1 is shaping up. Will it be likely to be more stable and
 robust than 9.0-RELEASE?

It appears to be for me.  I had problems with 9.0 not reading CDs and rebooting 
with no error messages frequently.  I have upgraded to 9.1-RC2 and it now reads 
CDs just fine, and has not rebooted.  However, the uptimes with 9.0 ranged from 
about 2 hours to 30 days.  I have only had 9.1-RC2 running for a couple weeks 
so have not declared victory yet.  It has been running for more than most of the 
uptimes already.


 Are there issues that will have to wait
 until 9.2-RELEASE to be fixed? Opinions welcome.

I have no information on this.



About zfs + nfs stability

2010-08-31 Thread Giulio Ferro

I have an 8.0-STABLE system (last built around April 2010) which I use
as an NFS server: amd64, 8GB RAM, ~7TB storage.

I had a lot of grief with the (sadly) well-known "kmem_map too small"
bug, which really compromised the quality of the service that server
was dedicated to.

There wasn't (and there still isn't) any relevant indication in the
official freebsd zfs documentation on how to bypass the problem.
Only thanks to the effort and goodwill of other users in this list
and with hours on end of trying,
I could come up with something working:
(in /boot/loader.conf)
vm.kmem_size=6096M
vfs.zfs.arc_max=3584M
vfs.zfs.prefetch_disable=1
vfs.zfs.txg.timeout=5
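
For anyone comparing their own settings against these, here is a minimal Python sketch that parses loader.conf-style lines and reports how large the ARC cap is relative to kmem_size. The helper names parse_size and parse_loader_conf are hypothetical, the values are simply the four lines quoted above, and the ratio is only a sanity check, not a recommendation.

```python
# Hypothetical helpers: parse loader.conf-style "name=value" lines and
# compare the ZFS ARC cap against the kernel memory map size.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def parse_size(text):
    """Convert a value such as '6096M' to bytes; plain digits are bytes."""
    text = text.strip().strip('"')
    if text and text[-1].upper() in UNITS:
        return int(text[:-1]) * UNITS[text[-1].upper()]
    return int(text)

def parse_loader_conf(lines):
    """Return a dict of tunables, skipping blank lines and '#' comments."""
    tunables = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            name, value = line.split("=", 1)
            tunables[name.strip()] = value.strip().strip('"')
    return tunables

# The four tunables quoted in the message above.
conf = parse_loader_conf([
    "vm.kmem_size=6096M",
    "vfs.zfs.arc_max=3584M",
    "vfs.zfs.prefetch_disable=1",
    "vfs.zfs.txg.timeout=5",
])
kmem = parse_size(conf["vm.kmem_size"])
arc = parse_size(conf["vfs.zfs.arc_max"])
arc_share = arc / kmem
print(f"ARC cap is {arc_share:.0%} of kmem_size")
```

With these values the ARC cap works out to a bit under 60% of kmem_size, which leaves the remainder of the kernel map for everything else.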

The freezes are gone, thankfully, but I often get huge slow-downs: 
looking in the logs of the nfs clients I get plenty of:

... kernel: nfs server ...:/path/to/dir: lockd not responding
... kernel: nfs server ...:/path/to/dir: lockd is alive again

I don't know if this has anything to do with zfs.
What I'd like to know is the answer to the following questions
by other users and/or developers.

I don't need opinions, only precise facts people have verified for
themselves.

1) Is it a good idea to upgrade this production system to the latest 8 
stable (8.1 stable I believe)? Is it really stable?

2) Is the aforementioned ZFS tuning in /boot/loader.conf still necessary?
3) Is it a good idea to switch to nfsv4? Performance? Stability?

and above all:

4) will I get a more stable and performant system by upgrading?

Thanks in advance for the answers...


Re: About zfs + nfs stability

2010-08-31 Thread jhell
On 08/31/2010 09:21, Giulio Ferro wrote:
 1) Is it a good idea to upgrade this production system to the latest 8
 stable (8.1 stable I believe)? Is it really stable?

On this question alone, I can confirm that upgrading to the stable
branch is safe. It might also be reasonable for you to locally merge
the changes from the two CDDL directories into your source tree.

Example: (Tested here)
cd /usr/src
svn merge svn://svn.freebsd.org/base/stable/8/cddl cddl
svn merge svn://svn.freebsd.org/base/stable/8/sys/cddl sys/cddl

If you do not have any local changes to your source tree for those parts
of the branch then you should not have any problems or conflicts upon
merge, and this will bring your system up to date with ZFSv14 in stable/8.

Another route if you use CVS would be to checkout the source tree using
Subversion and diff it locally but you should still end up with the same
result.

There are a few patches that I can recommend but they are for stable/8
that has been patched with ZFSv15 that is due to be committed some time
in September - November. Patches and descriptions below. And attached is
a UMA patch for the VFS subsystem that helps a little with performance
but not near as much as the patches below.

http://people.freebsd.org/~mm/patches/zfs/v15/stable-8-v15.patch
http://people.freebsd.org/~mm/patches/zfs/zfs_metaslab.patch
http://people.freebsd.org/~mm/patches/zfs/zfs_abe_stat_rrwlock.patch

And for the better performance question by upgrading... that is a real
hard question to answer not knowing your hardware implementation. There
really has not been that much of a performance increase that I can
account for regarding stable/8 vs. releng/8.1, or at least not yet.

PS: I have done minimal testing of NFSv4 (V4: /nfs). Either my
understanding of it or the way it is set up to work is somewhat
confusing. I have very little knowledge of NFSv4, so please take this
only as opinion, but I would not upgrade a production system to use
V4: /nfs quite yet unless the need demands it.


Regards & Good Luck,

-- 

 jhell,v
diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index 2dcd14f..ed07ecb 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -2727,14 +2727,26 @@ zone_free_item(uma_zone_t zone, void *item, void *udata,
 	}
 	MPASS(keg == slab->us_keg);
 
-	/* Do we need to remove from any lists? */
+	/* Move to the appropriate list or re-queue further from the head. */
 	if (slab->us_freecount+1 == keg->uk_ipers) {
+		/* Partial -> free. */
 		LIST_REMOVE(slab, us_link);
 		LIST_INSERT_HEAD(&keg->uk_free_slab, slab, us_link);
 	} else if (slab->us_freecount == 0) {
+		/* Full -> partial. */
 		LIST_REMOVE(slab, us_link);
 		LIST_INSERT_HEAD(&keg->uk_part_slab, slab, us_link);
 	}
+	else {
+		/* Partial -> partial. */
+		uma_slab_t tmp;
+
+		tmp = LIST_NEXT(slab, us_link);
+		if (tmp != NULL && slab->us_freecount > tmp->us_freecount) {
+			LIST_REMOVE(slab, us_link);
+			LIST_INSERT_AFTER(tmp, slab, us_link);
+		}
+	}
 
 	/* Slab management stuff */
 	freei = ((unsigned long)item - (unsigned long)slab->us_data)
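
The partial-to-partial branch of the patch above can be modelled outside the kernel. The following is a minimal Python sketch, assuming a plain list of free-item counts stands in for the keg's partial-slab list (head first) and that a slab moves one position toward the tail when it has more free items than its successor. The function name requeue_partial is hypothetical and nothing here is kernel code.

```python
def requeue_partial(partial, index):
    """Model of the 'partial -> partial' branch.

    `partial` is a list of free-item counts, head of the list first.
    If the slab at `index` has more free items than its successor, it is
    moved to just after that successor (one step further from the head).
    Returns the slab's new position.
    """
    nxt = index + 1
    if nxt < len(partial) and partial[index] > partial[nxt]:
        # Equivalent to LIST_REMOVE(slab) + LIST_INSERT_AFTER(tmp, slab).
        partial[index], partial[nxt] = partial[nxt], partial[index]
        return nxt
    return index

slabs = [3, 1, 2]               # free-item counts, head first
pos = requeue_partial(slabs, 0)
print(slabs, pos)
```

The effect is that fuller slabs drift toward the head, so new allocations tend to come from them while emptier slabs get a chance to drain completely and return to the free list.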

Re: About zfs + nfs stability

2010-08-31 Thread Rick Macklem
 
 The freezes are gone, thankfully, but I often get huge slow-downs:
 looking in the logs of the nfs clients I get plenty of:
 ... kernel: nfs server ...:/path/to/dir: lockd not responding
 ... kernel: nfs server ...:/path/to/dir: lockd is alive again
 

If you don't need file locking to work across multiple clients
concurrently (ie. multiple clients aren't locking the same file
at the same time), then you can avoid the NLM by using the
nolockd mount option on the clients. (Linux has a similar
mount option under a different name.)
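
As a concrete illustration of the option mentioned above, a FreeBSD client could carry it in /etc/fstab. This is a minimal sketch: the server name, export path, and mount point are placeholders, and on Linux clients the comparable option is spelled nolock.

```
# /etc/fstab on a FreeBSD NFS client (server, path, and mount point are placeholders)
# Device               Mountpoint  FStype  Options     Dump  Pass#
server:/path/to/dir    /mnt/data   nfs     rw,nolockd  0     0
```

With nolockd, locks are handled locally on each client and are not seen by the server, so this only suits the case described above, where clients do not contend for locks on the same file.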

 I don't know if this has anything to do with zfs.

I don't believe it has anything to do with zfs. The NLM is a
separate protocol from NFS.

 3) Is it a good idea to switch to nfsv4? Performance? Stability?
 

NFSv4 will provide better file locking (if you need that) imho, but
is still considered experimental, so it is hard to say how well
it will work for you. Some seem to use it without difficulties, whereas
others have problems. There is a recent unresolved thread where a
guy has perf. problems on some of his clients, but not all.

rick



Re: wpi0 stability and acpi_hp problems

2010-03-20 Thread Dominic Fandrey
On 19/03/2010 12:35, Dominic Fandrey wrote:
 ...
 
 My wpi problems are more severe.
 
 I recently purchased a new battery for my notebook and to improve
 my battery uptime I deactivated the bluetooth device in the BIOS
 (HP6510b).
 
 Ever since the wlan connection is less reliable. ...

Even with the bluetooth device reactivated the thing panics on me,
not exclusively, but extremely frequently, when using wpa_supplicant.

I fed this stuff into a PR:
http://www.freebsd.org/cgi/query-pr.cgi?pr=144898

-- 
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail? 


wpi0 stability and acpi_hp problems

2010-03-19 Thread Dominic Fandrey
I'm running RELENG_8 (built yesterday) and have encountered
problems with wpi and acpi_hp.

The thing about acpi_hp is that it misses most of the hardware
when activated in the loader.conf. The WLAN, BT and other sysctls
are only available if I load the module after boot.

My wpi problems are more severe.

I recently purchased a new battery for my notebook and to improve
my battery uptime I deactivated the bluetooth device in the BIOS
(HP6510b).

Ever since the wlan connection is less reliable. The device tends
to spontaneously turn itself off. I also occasionally see:
wpi0: could not configure bluetooth coexistence

The last time this happened (~2 hours ago) I went to the first
console to witness the dmesg events. I pressed the WLAN switch,
the dmesg showed a radio-on message, then the system panicked.

Unfortunately it didn't create a dump, probably because I've got
8 GB of RAM and only 4 GB of swap space, though with the new minidumps
that shouldn't really be a problem, I think.

Anyway, I at least copied the screen output on a sheet of paper
and will provide it tonight (I'm on a train at the moment, typing
it now is too inconvenient).

Regards

-- 
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail? 


Re: Stability problems with 7-stable (after 7.1 - 7.2 - 7-stable)

2009-12-18 Thread Alexander Leidinger

Quoting Boris Samorodov b...@ipt.ru (from Thu, 17 Dec 2009 20:55:44 +0300):


> Ivan Voras ivo...@freebsd.org writes:
>
>> Alexander Leidinger wrote:
>>
>>> Hi,
>>>
>>> please CC me on replies.
>
> Seems you were not CCed...

I'm now subscribed to stable@, thanks for forwarding this.

>>> I have a system which was at 7.1-pX. After the update to 7.2-p5 it
>>> started to exhibit deadlocks after some minutes of uptime.
>>>
>>> With 7.1 (generic kernel) it was running fine, with 7.2 generic the
>>> problems started directly.
>>>
>>> The system is now at 7-stable with a custom kernel
>>> (http://www.Leidinger.net/test/ALCATRAZ), basically generic without
>>> unneeded drivers plus witness/invariants/sw-watchdog.
>>>
>>> The system is an AMD Dual Core with NVidia MCP61 chipset
>>> (http://www.Leidinger.net/test/dmesg.alcatraz), 2 GB RAM, 2
>>> harddisks and FreeBSD 32bit install.
>>
>> Some generic things to try:
>> - did you monitor the system with something (top or systat
>> -vm) to see if there is something unusual, like interrupt storms?

When I had the initial problems, I asked for a KVM-switch to be
connected to the system (not a free service). In single-user mode I
didn't see any problem. When starting the system but not the jails, I
didn't see any problem (cvsup/buildworld/...). When I started the
jails, I started to see the problems.

>> - no physical access is a problem; If you do manage it, I'd
>> say try running single user for some time with systat -vm just to see
>> what happens.

This is not an option now.

>> I would not trust ZFS in 7-stable since it lags a bit behind patches
>> done to 8 but 7.2 should be fine - at least I don't have any such
>> problems with it (though no AMD boxes to test them with it).


Ivan, the system started out to be without ZFS, just after I started  
to see deadlocks I switched to ZFS. This _improved_ the situation. Now  
the system survives between 3h and about 11h without a deadlock. If I  
run every 5 minutes a script which logs 4 text lines to the root (UFS)  
and runs 3x sync + sleep 5 + 3x sync the frequency of deadlocks  
increases.
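
The periodic load described above can be sketched as follows. This is a minimal Python illustration, not the actual script: the function name stress_once, the log path, and the content of the log lines are assumptions; only the "4 lines, 3x sync, sleep 5, 3x sync" pattern comes from the description.

```python
import os
import time

def stress_once(log_path, lines=4, pause=5.0):
    """Append `lines` text lines to `log_path`, then call sync three
    times, sleep for `pause` seconds, and call sync three more times.
    Returns the number of lines written."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        for n in range(lines):
            # Line content is a placeholder; any short text would do.
            log.write(f"{stamp} heartbeat line {n + 1}\n")
    for _ in range(3):
        os.sync()
    time.sleep(pause)
    for _ in range(3):
        os.sync()
    return lines
```

Calling stress_once("/heartbeat.log") from a five-minute cron job would approximate the pattern: a small write to the UFS root followed by forced flushes of all filesystems.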



>> If you haven't updated your ZFS pools, I'd suggest reverting back to
>> 7.1, then building or downloading an 8.0 kernel and try it with 7.1
>> userland (reboot -k ...) simply to see if it helps.


IIRC there were KBI changes (ifconfig?) which prevent me from going back
to 7.1 without access to the console. As this is a production machine
(it hosts not only my blog/website/mails, but stuff from other people
too), the goal is to stabilize this system now.

Kib analyzed 2 crashdumps I had (watchdog triggered) and he thinks
they are because of ZFS deadlocks. So the initial problem (without
ZFS) is not known yet, but this info will hopefully help to stabilize
the system further (see also my mail about at least 57 unmerged ZFS
patches).


Bye,
Alexander.

--
Universities are places of knowledge.  The freshman each bring a little
in with them, and the seniors take none away, so knowledge accumulates.

http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org       netchild @ FreeBSD.org  : PGP ID = 72077137


Stability problems with 7-stable (after 7.1 - 7.2 - 7-stable)

2009-12-15 Thread Alexander Leidinger

Hi,

please CC me on replies.

I have a system which was at 7.1-pX. After the update to 7.2-p5 it  
started to exhibit deadlocks after some minutes of uptime.


With 7.1 (generic kernel) it was running fine, with 7.2 generic the  
problems started directly.


The system is now at 7-stable with a custom kernel  
(http://www.Leidinger.net/test/ALCATRAZ), basically generic without  
unneeded drivers plus witness/invariants/sw-watchdog.


The system is an AMD Dual Core with NVidia MCP61 chipset  
(http://www.Leidinger.net/test/dmesg.alcatraz), 2 GB RAM, 2 harddisks  
and FreeBSD 32bit install.


On the system are 3 jails (one postfix+mysql+apache, one  
mysql+apache+some-perl-service, one apache+mysql+xmpp-server). All of  
them have a 7-stable world.


The 2 disks were configured with 3 partition pairs for root-mirror,  
swap-mirror, and jail-mirror.


I tested with and without SMP, both schedulers, with  
WITNESS/INVARIANTS, and by removing one part of each mirror (to rule  
out that the disks are not in sync). In all cases the system was not  
stable and deadlocked after several minutes (even with only the  
mail-jail up and running). First no interaction via ssh is possible  
anymore, then even ping does not work anymore. After configuring the  
watchdog, I got at least the system back online automatically... :(


After reading  
http://www.mail-archive.com/freebsd-stable@freebsd.org/msg96901.html I  
decided to switch the FS for the jails to ZFS (currently only on one  
harddisk, the other partition for it is still with UFS, but not  
mounted at all) as a test.


Now with a little bit of kernel tuning for ZFS  
(http://www.Leidinger.net/test/loader.conf.alcatraz) I was able to  
keep the system up for about 3h with all jails activated (I started  
one jail after another, with waiting 1h between starting each jail).  
After that no access via ssh, no ping, but also no reboot from the  
sw-watchdog, I had to do a remote power-off/-on. After that I didn't
have any crashdump (in the watchdog cases I had dumps, but I have
recompiled the kernel since then, so I cannot provide useful output).


The current gmirror status output is at
   http://www.Leidinger.net/test/gmirror.alcatraz

The system has no serial console. I have no physical access.

For such a small setup I would expect that 7.2-GENERIC is more than  
enough. At least 7.1-GENERIC was running without any problem.


Does this problem sound familiar to someone, any ideas what to try,  
anyone with patches I could test?


Bye,
Alexander.

--
I'm not a real movie star -- I've still got the same wife I started out
with twenty-eight years ago.
-- Will Rogers

http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org       netchild @ FreeBSD.org  : PGP ID = 72077137


Re: Stability problems with 7-stable (after 7.1 - 7.2 - 7-stable)

2009-12-15 Thread Ivan Voras

Alexander Leidinger wrote:

> Hi,
>
> please CC me on replies.
>
> I have a system which was at 7.1-pX. After the update to 7.2-p5 it
> started to exhibit deadlocks after some minutes of uptime.
>
> With 7.1 (generic kernel) it was running fine, with 7.2 generic the
> problems started directly.
>
> The system is now at 7-stable with a custom kernel
> (http://www.Leidinger.net/test/ALCATRAZ), basically generic without
> unneeded drivers plus witness/invariants/sw-watchdog.
>
> The system is an AMD Dual Core with NVidia MCP61 chipset
> (http://www.Leidinger.net/test/dmesg.alcatraz), 2 GB RAM, 2 harddisks
> and FreeBSD 32bit install.


Some generic things to try:
	- did you monitor the system with something (top or systat -vm) to see 
if there is something unusual, like interrupt storms?
	- no physical access is a problem; If you do manage it, I'd say try 
running single user for some time with systat -vm just to see what happens.


I would not trust ZFS in 7-stable since it lags a bit behind patches 
done to 8 but 7.2 should be fine - at least I don't have any such 
problems with it (though no AMD boxes to test them with it).


If you haven't updated your ZFS pools, I'd suggest reverting back to 
7.1, then building or downloading an 8.0 kernel and try it with 7.1 
userland (reboot -k ...) simply to see if it helps.




Re: Multiple USB drives stability question

2009-08-15 Thread Jeff Richards
I am now trying to rsync large files from the 320GB gmirror+gjournal device to 
the 2nd 1TB gmirror+gjournal device.  Using gstat I see the 320GB device active 
all the time while the 1TB device loads in spurts.  There will be periods of 
multiple seconds where the target providers are completely idle while the 
source providers are still reporting 100% active.

Is there any tuning I should be investigating for these GEOM classes?

--- On Fri, 8/14/09, Jeff Richards bsd2...@yahoo.com wrote:

From: Jeff Richards bsd2...@yahoo.com
Subject: Re: Multiple USB drives stability question
To: freebsd-stable@freebsd.org
Date: Friday, August 14, 2009, 11:04 PM

I just tested my 2nd 1TB gmirror device on another system with FBSD 7.2.  I was 
getting full throughput on the drive and no lockup using bonnie++ and also 
monitoring with gstat.

I then moved those drives back to my main server.  When I booted, the system 
hung on the 320GB gmirror devices.  Previously the 1st 1TB gmirror and the 320GB 
gmirror were attached to the integrated USB ports on the motherboard.  I moved 
the 320GB gmirror to a PCI USB adapter.

The 2 320GB drives in the gmirror were da5 and da6.  Here's what I saw on the 
console:

(da6:umass-sim6:6:0:0): SYNCHRONIZE CACHE(10). CDB: 35 0 0 0 0 0 0 0 0 0
(da6:umass-sim6:6:0:0): CAM Status: SCSI Status Error
(da6:umass-sim6:6:0:0): SCSI Status: Check Condition
(da6:umass-sim6:6:0:0): ILLEGAL REQUEST asc:20,0
(da6:umass-sim6:6:0:0): Invalid command operation mode
(da6:umass-sim6:6:0:0): Unretryable error
GEOM_MIRROR: Request failed (error=5), da6[READ(offset=512, length=512)]
GEOM_MIRROR: Device gm-san: provider da6 disconnected.
(da5:umass-sim5:5:0:0): SYNCHRONIZE CACHE(10). CDB: 35 0 0 0 0 0 0 0 0 0
(da5:umass-sim5:5:0:0): CAM Status: SCSI Status Error
(da5:umass-sim5:5:0:0): SCSI Status: Check Condition
(da5:umass-sim5:5:0:0): ILLEGAL REQUEST asc:20,0
(da5:umass-sim5:5:0:0): Invalid command operation mode
(da5:umass-sim5:5:0:0): Unretryable error
GEOM_JOURNAL: BIO_FLUSH not supported by mirror/gm-san.
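
A minimal sketch (not part of the original post) of how a console log like the
one above could be scanned for drives that answer SYNCHRONIZE CACHE with
ILLEGAL REQUEST.  The function name and the regular expression are
illustrative assumptions; only the log lines come from the console output.

```python
import re

# Matches CAM console lines such as "(da6:umass-sim6:6:0:0): <message>".
CAM_LINE = re.compile(r"^\((da\d+):[^)]*\):\s*(.*)$")

# Excerpt of the console output quoted in this message.
LOG = """\
(da6:umass-sim6:6:0:0): SYNCHRONIZE CACHE(10). CDB: 35 0 0 0 0 0 0 0 0 0
(da6:umass-sim6:6:0:0): CAM Status: SCSI Status Error
(da6:umass-sim6:6:0:0): SCSI Status: Check Condition
(da6:umass-sim6:6:0:0): ILLEGAL REQUEST asc:20,0
GEOM_MIRROR: Device gm-san: provider da6 disconnected.
(da5:umass-sim5:5:0:0): SYNCHRONIZE CACHE(10). CDB: 35 0 0 0 0 0 0 0 0 0
(da5:umass-sim5:5:0:0): ILLEGAL REQUEST asc:20,0
GEOM_JOURNAL: BIO_FLUSH not supported by mirror/gm-san.
"""


def devices_rejecting_sync_cache(log_text):
    """Return da devices whose SYNCHRONIZE CACHE drew an ILLEGAL REQUEST."""
    pending = set()   # devices with a SYNCHRONIZE CACHE seen, no verdict yet
    rejected = []     # devices in order of first rejection
    for line in log_text.splitlines():
        match = CAM_LINE.match(line.strip())
        if not match:
            continue  # GEOM_* lines and anything else are ignored
        dev, msg = match.groups()
        if msg.startswith("SYNCHRONIZE CACHE"):
            pending.add(dev)
        elif msg.startswith("ILLEGAL REQUEST") and dev in pending:
            pending.discard(dev)
            if dev not in rejected:
                rejected.append(dev)
    return rejected


if __name__ == "__main__":
    print(devices_rejecting_sync_cache(LOG))
```

On the excerpt above this lists da6 and da5, the two members of the 320GB
mirror.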

I waited for a few minutes with no change in the console.  I then detached one 
of the USB drives (which happened to be da6) and saw this:

umass6: at uhub7 port 4 (addr 4) disconnected
(da6:umass-sim6:6:0:0): lost device

Nothing else changed for a few minutes so I powered off the system.  When I 
brought it back up the 320GB gmirror device was out of sync, but apart from 
that all devices were online.

Below are the kernel messages from the second boot:

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.2-RELEASE #0: Fri May  1 08:49:13 UTC 2009
    r...@walker.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Celeron(R) CPU 2.26GHz (2266.67-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf49  Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x441d<SSE3,DTES64,MON,DS_CPL,CNXT-ID,xTPR>
  AMD Features2=0x1<LAHF>
real memory  = 1877868544 (1790 MB)
avail memory = 1826934784 (1742 MB)
ACPI APIC Table: P4M900 AWRDACPI
ioapic0 Version 0.3 irqs 0-23 on motherboard
ioapic1 Version 0.3 irqs 24-47 on motherboard
kbd1 at kbdmux0
acpi0: P4M900 AWRDACPI on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a (3) failed
acpi0: reservation of 10, 6fde (3) failed
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x408-0x40b on acpi0
acpi_hpet0: High Precision Event Timer iomem 0xfe80-0xfe8003ff on acpi0
device_attach: acpi_hpet0 attach returned 12
acpi_button0: Power Button on acpi0
acpi_button1: Sleep Button on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
pcib1: PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
vgapci0: VGA-compatible display mem 0xc000-0xcfff,0xfb00-0xfbff irq 16 at device 0.0 on pci1
pcib2: ACPI PCI-PCI bridge irq 27 at device 2.0 on pci0
pci2: ACPI PCI bus on pcib2
pcib3: ACPI PCI-PCI bridge irq 31 at device 3.0 on pci0
pci3: ACPI PCI bus on pcib3
atapci0: VIA 8237S SATA150 controller port 0xfc00-0xfc07,0xf800-0xf803,0xf400-0xf407,0xf000-0xf003,0xec00-0xec0f,0xe800-0xe8ff irq 21 at device 15.0 on pci0
atapci0: [ITHREAD]
ata2: ATA channel 0 on atapci0
ata2: [ITHREAD]
ata3: ATA channel 1 on atapci0
ata3: [ITHREAD]
atapci1: VIA 8237S UDMA133 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe400-0xe40f at device 15.1 on pci0
ata0: ATA channel 0 on atapci1
ata0: [ITHREAD]
ata1: ATA channel 1 on atapci1
ata1: [ITHREAD]
uhci0: VIA 83C572 USB controller port 0xe000-0xe01f irq 20 at device 16.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0

Re: Multiple USB drives stability question

2009-08-15 Thread Jeff Richards
I was checking sysctl and noticed skipped_bytes, alloc_failures, and low_mem all 
increasing under kern.geom.journal.

$ sysctl -a | grep geom
kern.geom.collectstats: 1
kern.geom.debugflags: 0
kern.geom.label.debug: 0
kern.geom.mirror.sync_requests: 2
kern.geom.mirror.disconnect_on_failure: 1
kern.geom.mirror.idletime: 5
kern.geom.mirror.timeout: 4
kern.geom.mirror.debug: 0
kern.geom.journal.stats.low_mem: 380
kern.geom.journal.stats.journal_full: 0
kern.geom.journal.stats.wait_for_copy: 25
kern.geom.journal.stats.switches: 834
kern.geom.journal.stats.combined_ios: 5612
kern.geom.journal.stats.skipped_bytes: 34684928
kern.geom.journal.cache.alloc_failures: 14726
kern.geom.journal.cache.misses: 13894
kern.geom.journal.cache.switch: 90
kern.geom.journal.cache.divisor: 2
kern.geom.journal.cache.limit: 167772160
kern.geom.journal.cache.used: 79546368
kern.geom.journal.optimize: 1
kern.geom.journal.record_entries: 20
kern.geom.journal.parallel_copies: 16
kern.geom.journal.accept_immediately: 64
kern.geom.journal.parallel_flushes: 16
kern.geom.journal.force_switch: 70
kern.geom.journal.switch_time: 10
kern.geom.journal.debug: 0
kern.geom.virstor.component_watermark: 1
kern.geom.virstor.chunk_watermark: 100
kern.geom.virstor.debug: 2
debug.sizeof.g_geom: 68


$ sysctl -a | grep geom
kern.geom.collectstats: 1
kern.geom.debugflags: 0
kern.geom.label.debug: 0
kern.geom.mirror.sync_requests: 2
kern.geom.mirror.disconnect_on_failure: 1
kern.geom.mirror.idletime: 5
kern.geom.mirror.timeout: 4
kern.geom.mirror.debug: 0
kern.geom.journal.stats.low_mem: 389
kern.geom.journal.stats.journal_full: 0
kern.geom.journal.stats.wait_for_copy: 28
kern.geom.journal.stats.switches: 838
kern.geom.journal.stats.combined_ios: 5622
kern.geom.journal.stats.skipped_bytes: 35667968
kern.geom.journal.cache.alloc_failures: 15016
kern.geom.journal.cache.misses: 15079
kern.geom.journal.cache.switch: 90
kern.geom.journal.cache.divisor: 2
kern.geom.journal.cache.limit: 167772160
kern.geom.journal.cache.used: 73140224
kern.geom.journal.optimize: 1
kern.geom.journal.record_entries: 20
kern.geom.journal.parallel_copies: 16
kern.geom.journal.accept_immediately: 64
kern.geom.journal.parallel_flushes: 16
kern.geom.journal.force_switch: 70
kern.geom.journal.switch_time: 10
kern.geom.journal.debug: 0
kern.geom.virstor.component_watermark: 1
kern.geom.virstor.chunk_watermark: 100
kern.geom.virstor.debug: 2
debug.sizeof.g_geom: 68
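
To make the change between the two snapshots easier to see, here is a minimal
sketch (an illustration only, not a FreeBSD tool) that diffs two pasted
`sysctl` outputs.  The function names are hypothetical; the sample values are
copied from the two listings above.

```python
BEFORE = """\
kern.geom.journal.stats.low_mem: 380
kern.geom.journal.stats.journal_full: 0
kern.geom.journal.stats.skipped_bytes: 34684928
kern.geom.journal.cache.alloc_failures: 14726
kern.geom.journal.cache.used: 79546368
"""

AFTER = """\
kern.geom.journal.stats.low_mem: 389
kern.geom.journal.stats.journal_full: 0
kern.geom.journal.stats.skipped_bytes: 35667968
kern.geom.journal.cache.alloc_failures: 15016
kern.geom.journal.cache.used: 73140224
"""


def parse_sysctl(text):
    """Parse 'name: value' lines, keeping only integer-valued entries."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        value = value.strip()
        if sep and value.lstrip("-").isdigit():
            values[name.strip()] = int(value)
    return values


def counter_deltas(before, after, prefix="kern.geom.journal."):
    """Return {name: after - before} for entries under prefix that changed."""
    old, new = parse_sysctl(before), parse_sysctl(after)
    return {
        name: new[name] - old[name]
        for name in new
        if name.startswith(prefix) and name in old and new[name] != old[name]
    }


if __name__ == "__main__":
    for name, delta in sorted(counter_deltas(BEFORE, AFTER).items()):
        print(f"{name}: {delta:+d}")
```

Between the two snapshots low_mem rose by 9, alloc_failures by 290, and
skipped_bytes by 983040, while journal_full stayed at 0.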

--- On Sat, 8/15/09, Jeff Richards bsd2...@yahoo.com wrote:

From: Jeff Richards bsd2...@yahoo.com
Subject: Re: Multiple USB drives stability question
To: freebsd-stable@freebsd.org
Date: Saturday, August 15, 2009, 10:50 AM

[...]

Re: Multiple USB drives stability question

2009-08-15 Thread Ben Stuyts

Jeff,

On 15 aug 2009, at 05:04, Jeff Richards wrote:

> (da6:umass-sim6:6:0:0): SYNCHRONIZE CACHE(10). CDB: 35 0 0 0 0 0 0 0 0 0

...

I've had lots of stability issues with USB drives until I added some  
quirks to prevent the SYNCHRONIZE CACHE from happening. For example:


Index: cam/scsi/scsi_da.c
===================================================================
RCS file: /usr/ncvs/src/sys/cam/scsi/scsi_da.c,v
retrieving revision 1.224.2.6
diff -r1.224.2.6 scsi_da.c
539a540,547
>  * LaCie external 250GB Hard drive designed by Porsche
>  * PR: usb/121474
>  */
> {T_DIRECT, SIP_MEDIA_FIXED, "SAMSUNG", "HM250JI", "*"},
> /*quirks*/ DA_Q_NO_SYNC_CACHE
> },
> {
> /*

You might try that, and see if it improves your situation.

Ben



Multiple USB drives stability question

2009-08-14 Thread Jeff Richards
Is there a practical limit on the number of active USB drives with FreeBSD?  
I've had stability issues using multiple USB drives as storage.

My initial design goal was cheap, hot-swappable storage.  I am only using a 
100MB network currently, so storage throughput is not a problem: I can't push 
data to or from the drives faster than the network can carry it.

I first tried my setup on 7.0, then migrated to a newer PC, then upgraded to 
7.2. 
 
I have the following USB drive setup:

1 320GB gmirror (320x2) + gjournal + ufs2
1 1TB gmirror (1TBx2) + gjournal + ufs2
1 150GB gjournal  + ufs2

I also have another 1TB gmirror (1TBx2) + gjournal but removed it.  The system 
crashed when I used these drives (bacula or bonnie++) so I pulled them to test 
on another system.

Recently my stability issue has shown up while writing data to the 150GB 
gjournal drive from the 320GB gmirror device (USB device -> USB device).  It 
will be working fine, then all I/O stops on the 150GB drive.  The system 
keeps responding to other USB devices etc. for a while.  I try rebooting and 
the system crashes with gjournal errors (I didn't write them down, but I will 
next time).

Every time this happens the 1TB gmirror comes up fine but one of the 320GB 
providers is missing.  No problem after 'gmirror forget' and 'gmirror insert'.  
Everything rebuilds fine.  The 150GB gjournal drive is fine after a 'fsck -y'.
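
The recovery described above can be sketched as a small dry-run helper.  This
is only an illustration: the helper names are hypothetical, and the mirror
name gm-san and provider da6 are example values that would have to match the
actual setup.  `gmirror forget` and `gmirror insert` are the subcommands named
in the paragraph above.

```python
import subprocess


def gmirror_recovery_commands(mirror, provider):
    """Build the two commands: drop the stale member, then re-add the disk."""
    return [
        ["gmirror", "forget", mirror],            # forget disconnected members
        ["gmirror", "insert", mirror, provider],  # re-insert and resynchronize
    ]


def run_recovery(mirror, provider, dry_run=True):
    """Print the commands (default) or run them; needs root when not dry."""
    commands = gmirror_recovery_commands(mirror, provider)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_recovery("gm-san", "da6")  # example names, dry run only
```

Keeping the dry run as the default means nothing touches the mirror until the
printed commands have been checked by hand.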

I do attach both drives of each gmirror to the same USB adapter.  After initial 
testing with multiple USB adapters, I found they do not behave consistently 
enough to split a mirror across adapters, as I would on a production server at 
work to avoid the adapter being a single point of failure.

I have tried Linux as well with softraid and LVM2 on the same hardware.  It 
worked fine until I applied software updates and the udev took 30+ minutes to 
boot.  I went back to FreeBSD.  Even when I crashed I was back up in 2-5 
minutes.

I can and will provide more detail if requested.  My concern is that the issue 
seems to continue no matter what hardware/OS changes I try.

Thanks in advance.







Re: Multiple USB drives stability question

2009-08-14 Thread Jeff Richards
 removed.
GEOM_LABEL: Label for provider ad0s1f is ufsid/4a42cfbd65525a3f.
GEOM_LABEL: Label ufsid/4a42cfbd75a68b18 removed.
GEOM_LABEL: Label for provider ad0s1g is ufsid/4a42cfbd75a68b18.
GEOM_LABEL: Label ufsid/4a42cfbdcada79a0 removed.
GEOM_LABEL: Label for provider ad0s2d is ufsid/4a42cfbdcada79a0.
GEOM_LABEL: Label ufsid/4a42cfc28b730061 removed.
GEOM_LABEL: Label for provider ad0s2e is ufsid/4a42cfc28b730061.
GEOM_LABEL: Label ufsid/4a42cfc21242e734 removed.
GEOM_LABEL: Label for provider ad0s1d is ufsid/4a42cfc21242e734.
GEOM_LABEL: Label ufsid/4a42cfc236be6f59 removed.
GEOM_LABEL: Label for provider ad0s2f is ufsid/4a42cfc236be6f59.
GEOM_LABEL: Label ufsid/4a42cfbde524d087 removed.
GEOM_LABEL: Label ufsid/4a42cfbdfcdf27b1 removed.
GEOM_LABEL: Label ufsid/4a42cfbd65525a3f removed.
GEOM_LABEL: Label ufsid/4a42cfbd75a68b18 removed.
GEOM_LABEL: Label ufsid/4a42cfbdcada79a0 removed.
GEOM_LABEL: Label ufsid/4a430e552079b936 removed.
GEOM_LABEL: Label ufsid/4a42cfc28b730061 removed.
GEOM_LABEL: Label ufsid/4a42cfc21242e734 removed.
GEOM_LABEL: Label ufsid/4a42cfc236be6f59 removed.
GEOM_LABEL: Label ufsid/4a3f26878cf7f367 removed.
GEOM_LABEL: Label ufsid/4a40c57f604c2e44 removed.
GEOM_LABEL: Label ufsid/49273a95d669d784 removed.
fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.8
GEOM_LABEL: Label ufsid/4a509cddbd500a7e removed.


--- On Fri, 8/14/09, Jeff Richards bsd2...@yahoo.com wrote:

From: Jeff Richards bsd2...@yahoo.com
Subject: Multiple USB drives stability question
To: freebsd-stable@freebsd.org
Date: Friday, August 14, 2009, 8:19 PM

[...]


Re: [ATA] and re(4) stability issues

2008-12-15 Thread Victor Balada Diaz
On Fri, Dec 12, 2008 at 01:13:09PM +0100, Victor Balada Diaz wrote:
> On Thu, Dec 11, 2008 at 10:50:21AM +0100, Victor Balada Diaz wrote:
> > On Thu, Dec 11, 2008 at 06:00:56PM +0900, Pyun YongHyeon wrote:
> > >
> > > I've reverted r185756 which caused GMII access issues on some
> > > controllers. If you are brave enough to try beta code, you can
> > > get latest re(4) in the following URL. Note, I don't have PCIe
> > > based RealTek controllers so the code was not tested at all.
> > >
> > > http://people.freebsd.org/~yongari/re/if_re.c
> > > http://people.freebsd.org/~yongari/re/if_rlreg.h
> >
> > I've recompiled the kernel with the first file in sys/dev/re/
> > and the second one in sys/pci/. I'm still testing with MSI enabled.
> >
> > So far tried rebooting using nextboot(8) (just in case i lost the
> > network card i could boot again) and the card seems to work
> > but i'll continue stress testing the machine with stress + dd +
> > iperf and see if i can take it down. I'll let you know how it goes.
>
> After a day of stress testing the machine haven't got errors, interrupt
> storms or interface up/down problems. Everything seems fine.
> I'll continue stress testing the machine during the weekend, but
> i would say that this time it's fixed.

Stopped stress testing this morning. After all the weekend testing
seems the re(4) problems were fixed. No single interface up/down error.
netstat -i reports no errors and everything is fine. Thanks a lot!

I'm going to deploy the patches on our production machines.

I've been able to trigger interrupt storms with ATA code, though.

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: [ATA] and re(4) stability issues

2008-12-15 Thread Victor Balada Diaz
On Wed, Dec 10, 2008 at 10:55:35AM +0100, Søren Schmidt wrote:
> On 10Dec, 2008, at 10:11 , Victor Balada Diaz wrote:
>
> > Thanks for explaining me what the flags do. I'm not skilled enough to create
> > the DMA quirks but if you could give me some patches i'll test them. Also
> > if you have any other idea on what could i test or how can i debug this
> > it would be more than welcome.
>
> Comment out the following two lines in ata_ahci_dmainit():
>
> if (ATA_INL(ctlr->r_res2, ATA_AHCI_CAP) & ATA_AHCI_CAP_64BIT)
>     ch->dma->max_address = BUS_SPACE_MAXADDR;
>
> And you will not use 64bit DMA even if the chipset supports it.
> However I have not seen any chipsets supporting this fail, YMMV as
> usual :)
 
Hello Søren,

I'm triggering interrupt storms with this chipset after a few
days of stressing the HD calling sysutils/stress with stress -d 10 -i 10
and in other term, doing: 

 while true; do dd if=/dev/zero of=BAH bs=1M count=1024; done;

Right now, as reported by systat -vmstat i have 578k interrupts in atapci
and the machine is idle. Do you have any idea on how could i debug
this? any advice would be much more than welcome.
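
One way to narrow this down is to capture `vmstat -i` periodically and flag
any source whose rate is far above the rest.  The sketch below is only an
illustration: it assumes the usual three-column `vmstat -i` layout (name,
total, rate), the function names are hypothetical, and the sample figures are
invented to resemble the 578k figure above rather than taken from a real
capture.

```python
# Invented sample in the usual `vmstat -i` layout; not a real capture.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                          42          0
irq21: atapci0                 208080000     578000
cpu0: timer                       720000       2000
Total                          208800042     580000
"""


def parse_vmstat_i(text):
    """Return {source: (total, rate)}, skipping the header and Total rows."""
    sources = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 2)  # name may contain spaces; split from right
        if len(parts) != 3:
            continue
        name, total, rate = parts
        if not (total.isdigit() and rate.isdigit()):
            continue  # header line
        name = name.strip()
        if name == "Total":
            continue
        sources[name] = (int(total), int(rate))
    return sources


def storm_suspects(text, rate_limit=10000):
    """Return sources whose interrupt rate exceeds rate_limit, sorted."""
    return sorted(
        name
        for name, (_total, rate) in parse_vmstat_i(text).items()
        if rate > rate_limit
    )


if __name__ == "__main__":
    print(storm_suspects(SAMPLE))
```

The rate_limit default of 10000 is an arbitrary threshold chosen for the
example; a sensible value depends on the hardware and workload.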

Thanks a lot.
Regards.

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: [ATA] and re(4) stability issues

2008-12-15 Thread Victor Balada Diaz
On Mon, Dec 15, 2008 at 10:02:07AM +0100, Victor Balada Diaz wrote:
> Stopped stress testing this morning. After all the weekend testing
> seems the re(4) problems were fixed. No single interface up/down error.
> netstat -i reports no errors and everything is fine. Thanks a lot!
>
> I'm going to deploy the patches on our production machines.
>
> I've been able to trigger interrupt storms with ATA code, though.

After deploying it in various machines this night i've found in the
logs messages like this one:

re0: watchdog timeout (missed Tx interrupts) -- recovering

I know you told me this is harmless, so this is just so you
know it's happening.

Regards.

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: [ATA] and re(4) stability issues

2008-12-15 Thread Pyun YongHyeon
On Tue, Dec 16, 2008 at 08:19:19AM +0100, Victor Balada Diaz wrote:
 > On Mon, Dec 15, 2008 at 10:02:07AM +0100, Victor Balada Diaz wrote:
 > > Stopped stress testing this morning. After all the weekend testing
 > > seems the re(4) problems were fixed. No single interface up/down error.
 > > netstat -i reports no errors and everything is fine. Thanks a lot!
 > >
 > > I'm going to deploy the patches on our production machines.
 > >
 > > I've been able to trigger interrupt storms with ATA code, though.
 >
 > After deploying it in various machines this night i've found in the
 > logs messages like this one:
 >
 > re0: watchdog timeout (missed Tx interrupts) -- recovering
 >
 > I know you told me this is harmless, so this is just so you

Yes, it's not a real watchdog timeout as long as re(4) still works
correctly.

 > know it's happening.
 >

Ok. I'll update re(4) when I find spare time.

-- 
Regards,
Pyun YongHyeon


Re: [ATA] and re(4) stability issues

2008-12-12 Thread Victor Balada Diaz
On Thu, Dec 11, 2008 at 10:50:21AM +0100, Victor Balada Diaz wrote:
> On Thu, Dec 11, 2008 at 06:00:56PM +0900, Pyun YongHyeon wrote:
> >
> > I've reverted r185756 which caused GMII access issues on some
> > controllers. If you are brave enough to try beta code, you can
> > get latest re(4) in the following URL. Note, I don't have PCIe
> > based RealTek controllers so the code was not tested at all.
> >
> > http://people.freebsd.org/~yongari/re/if_re.c
> > http://people.freebsd.org/~yongari/re/if_rlreg.h
>
> I've recompiled the kernel with the first file in sys/dev/re/
> and the second one in sys/pci/. I'm still testing with MSI enabled.
>
> So far tried rebooting using nextboot(8) (just in case i lost the
> network card i could boot again) and the card seems to work
> but i'll continue stress testing the machine with stress + dd +
> iperf and see if i can take it down. I'll let you know how it goes.

After a day of stress testing the machine haven't got errors, interrupt
storms or interface up/down problems. Everything seems fine.
I'll continue stress testing the machine during the weekend, but
i would say that this time it's fixed.

Seems lately there have been a lot of testing of this driver. Is
there any chance of it being on 7.1 or being MFCed after the release
to RELENG_7?

Thanks a lot.
Regards.

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: [ATA] and re(4) stability issues

2008-12-12 Thread Pyun YongHyeon
On Fri, Dec 12, 2008 at 01:13:09PM +0100, Victor Balada Diaz wrote:
 > On Thu, Dec 11, 2008 at 10:50:21AM +0100, Victor Balada Diaz wrote:
 > > On Thu, Dec 11, 2008 at 06:00:56PM +0900, Pyun YongHyeon wrote:
 > > >
 > > > I've reverted r185756 which caused GMII access issues on some
 > > > controllers. If you are brave enough to try beta code, you can
 > > > get latest re(4) in the following URL. Note, I don't have PCIe
 > > > based RealTek controllers so the code was not tested at all.
 > > >
 > > > http://people.freebsd.org/~yongari/re/if_re.c
 > > > http://people.freebsd.org/~yongari/re/if_rlreg.h
 > >
 > > I've recompiled the kernel with the first file in sys/dev/re/
 > > and the second one in sys/pci/. I'm still testing with MSI enabled.
 > >
 > > So far tried rebooting using nextboot(8) (just in case i lost the
 > > network card i could boot again) and the card seems to work
 > > but i'll continue stress testing the machine with stress + dd +
 > > iperf and see if i can take it down. I'll let you know how it goes.
 >
 > After a day of stress testing the machine haven't got errors, interrupt
 > storms or interface up/down problems. Everything seems fine.
 > I'll continue stress testing the machine during the weekend, but
 > i would say that this time it's fixed.
 >

Thanks for testing!

 > Seems lately there have been a lot of testing of this driver. Is
 > there any chance of it being on 7.1 or being MFCed after the release
 > to RELENG_7?
 >

It's too early to say MFC but I think MFC would be done after
releasing 7.1-RELEASE if all goes well.

-- 
Regards,
Pyun YongHyeon


Re: [ATA] and re(4) stability issues

2008-12-11 Thread Victor Balada Diaz
On Thu, Dec 11, 2008 at 05:05:59AM +1100, Peter Jeremy wrote:
> On 2008-Dec-10 10:55:35 +0100, Søren Schmidt [EMAIL PROTECTED] wrote:
> > And you will not use 64bit DMA even if the chipset supports it.
> > However I have not seen any chipsets supporting this fail, YMMV as
> > usual :)
>
> There's a reference in wikipedia pointing to
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg06694.html
> that claims the AMD/ATI SB600 lies about supporting 64-bit DMA in AHCI
> mode.  I have a SB600 but it doesn't have 4GB to test on.

I have 6 GB of RAM and can test patches, so once i'm done with the re(4)
side of things i'll try commenting the code Soren's suggested and see
if that improves the situation.

Regards.

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: [ATA] and re(4) stability issues

2008-12-11 Thread Victor Balada Diaz
On Thu, Dec 11, 2008 at 08:57:07AM +0100, Victor Balada Diaz wrote:
> On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:
> > On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
> > > Also i didn't see any problem with interfaces going up and down,
> > > but that usually happen after some hours of uptime, so i'll let
> > > you know if the error happens again.
> > >
>
> After writing to the HD with dd for a few hours and using
> stress -i 10 -d 10 the machine lost connectivity. I waited until
> today to be sure if the machine hung, paniced or just lost network
> connectivity. I don't have local access or serial access, so this
> is the only way i could do it. I've seen in the logs during the
> night various messages of:
>
>
> Dec 10 00:33:49 yac kernel: re0: watchdog timeout
> Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN
> Dec 10 00:33:52 yac kernel: re0: link state changed to UP
>
> The interface never recovered and i wasn't able to ping the machine
> until i rebooted. Nagios was checking all the time and no recovery
> happened.
>
> The netstat -i in daily scripts shows just one Oerrs. I'm used to
> have a lot of them, but seems this time the card didn't recover from
> the only one. I also want to say that this is not a regression, as
> it happened before with 7.1 -BETA 2 code.
>
> Is there anything more i can try?

Sorry it's too early in the morning and i thought today was 10
instead of 11. I don't even know the day i'm today.

Looking at today's log i see no link state changed messages
but i see this other messages that started happening more or
less at the same time i lost connectivity to the server:

Dec 10 18:20:32 yac kernel: re0: link state changed to DOWN
Dec 10 18:20:32 yac kernel: re0: PHY read failed

Sorry for the noise.

Regards.
-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: [ATA] and re(4) stability issues

2008-12-11 Thread Pyun YongHyeon
On Thu, Dec 11, 2008 at 09:10:45AM +0100, Victor Balada Diaz wrote:
 > On Thu, Dec 11, 2008 at 08:57:07AM +0100, Victor Balada Diaz wrote:
 > > On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:
 > > > On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
 > > > > Also i didn't see any problem with interfaces going up and down,
 > > > > but that usually happen after some hours of uptime, so i'll let
 > > > > you know if the error happens again.
 > > > >
 > >
 > > After writing to the HD with dd for a few hours and using
 > > stress -i 10 -d 10 the machine lost connectivity. I waited until
 > > today to be sure if the machine hung, paniced or just lost network
 > > connectivity. I don't have local access or serial access, so this
 > > is the only way i could do it. I've seen in the logs during the
 > > night various messages of:
 > >
 > >
 > > Dec 10 00:33:49 yac kernel: re0: watchdog timeout
 > > Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN
 > > Dec 10 00:33:52 yac kernel: re0: link state changed to UP
 > >
 > > The interface never recovered and i wasn't able to ping the machine
 > > until i rebooted. Nagios was checking all the time and no recovery
 > > happened.
 > >
 > > The netstat -i in daily scripts shows just one Oerrs. I'm used to
 > > have a lot of them, but seems this time the card didn't recover from
 > > the only one. I also want to say that this is not a regression, as
 > > it happened before with 7.1 -BETA 2 code.
 > >
 > > Is there anything more i can try?
 >
 > Sorry it's too early in the morning and i thought today was 10
 > instead of 11. I don't even know the day i'm today.
 >
 > Looking at today's log i see no link state changed messages
 > but i see this other messages that started happening more or
 > less at the same time i lost connectivity to the server:
 >
 > Dec 10 18:20:32 yac kernel: re0: link state changed to DOWN
 > Dec 10 18:20:32 yac kernel: re0: PHY read failed
 >

I've reverted r185756 which caused GMII access issues on some
controllers. If you are brave enough to try beta code, you can
get latest re(4) in the following URL. Note, I don't have PCIe
based RealTek controllers so the code was not tested at all.

http://people.freebsd.org/~yongari/re/if_re.c
http://people.freebsd.org/~yongari/re/if_rlreg.h

-- 
Regards,
Pyun YongHyeon


Re: [ATA] and re(4) stability issues

2008-12-11 Thread Victor Balada Diaz
On Thu, Dec 11, 2008 at 06:00:56PM +0900, Pyun YongHyeon wrote:
 On Thu, Dec 11, 2008 at 09:10:45AM +0100, Victor Balada Diaz wrote:
   On Thu, Dec 11, 2008 at 08:57:07AM +0100, Victor Balada Diaz wrote:
On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:
 On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
   Also i didn't see any problem with interfaces going up and down,
   but that usually happen after some hours of uptime, so i'll let
   you know if the error happens again.
   

After writing to the HD with dd for a few hours and using
stress -i 10 -d 10 the machine lost connectivity. I waited until
today to be sure whether the machine hung, panicked or just lost network
connectivity. I don't have local access or serial access, so this
is the only way I could do it. During the night I saw various
messages like these in the logs:


Dec 10 00:33:49 yac kernel: re0: watchdog timeout
Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN
Dec 10 00:33:52 yac kernel: re0: link state changed to UP

The interface never recovered and i wasn't able to ping the machine
until i rebooted. Nagios was checking all the time and no recovery
happened.

The netstat -i output in the daily scripts shows just one Oerrs. I'm used
to having a lot of them, but it seems this time the card didn't recover
from the only one. I also want to say that this is not a regression, as
it happened before with 7.1 -BETA 2 code.

Is there anything more i can try?
   
   Sorry, it's too early in the morning and I thought today was the 10th
   instead of the 11th. I don't even know what day it is today.
   
   Looking at today's log I see no link state changed messages,
   but I see these other messages, which started happening at more or
   less the same time I lost connectivity to the server:
   
   Dec 10 18:20:32 yac kernel: re0: link state changed to DOWN
   Dec 10 18:20:32 yac kernel: re0: PHY read failed
   
 
 I've reverted r185756, which caused GMII access issues on some
 controllers. If you are brave enough to try beta code, you can
 get the latest re(4) from the following URLs. Note that I don't have
 PCIe-based RealTek controllers, so the code has not been tested at all.
 
 http://people.freebsd.org/~yongari/re/if_re.c
 http://people.freebsd.org/~yongari/re/if_rlreg.h

I've recompiled the kernel with the first file in sys/dev/re/
and the second one in sys/pci/. I'm still testing with MSI enabled.

So far I have tried rebooting using nextboot(8) (so that I could boot
again in case I lost the network card) and the card seems to work,
but I'll continue stress testing the machine with stress + dd +
iperf and see if I can take it down. I'll let you know how it goes.

Regards.

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: [ATA] and re(4) stability issues

2008-12-10 Thread Andrey V. Elsukov

Victor Balada Diaz wrote:

Digging at linux source code i've found that they do some special things
for this chipset that i've been unable to find on our code. This is
linux code for my chipset:

371 AHCI_HFLAGS (AHCI_HFLAG_IGN_SERR_INTERNAL |
372  AHCI_HFLAG_32BIT_ONLY | AHCI_HFLAG_NO_MSI |
373  AHCI_HFLAG_SECT255),

File and the rest of the code in here[3].

As i saw AHCI_HFLAG_NO_MSI i tried doing the easiest thing i could
think of, switching MSI and MSI-x off for the whole system, so
i added to /boot/loader.conf this tunables:


FreeBSD's ata(4) driver doesn't support MSI. This flag is used in Linux's libata like this:

if ((hpriv->flags & AHCI_HFLAG_NO_MSI) || pci_enable_msi(pdev))
        pci_intx(pdev, 1);

In FreeBSD's code we have the same:

/* enable PCI interrupt */
pci_write_config(dev, PCIR_COMMAND,
                 pci_read_config(dev, PCIR_COMMAND, 2) & ~0x0400, 2);

The AHCI_HFLAG_IGN_SERR_INTERNAL flag is meant to ignore SERR_INTERNAL errors.
FreeBSD's ata(4) driver ignores them too.

The AHCI_HFLAG_32BIT_ONLY flag restricts the driver to 32-bit DMA.
In FreeBSD, if the AHCI CAP register reports that the controller supports
64-bit DMA, the driver will use 64-bit DMA.
So I think a quirk could be added for your chipset, but I'm not sure that the
problem is here.

The AHCI_HFLAG_SECT255 flag limits I/O operations to 255 sectors; FreeBSD uses
a limit of 128 by default.
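
To illustrate the PCI command register manipulation quoted above, here is a minimal, self-contained C sketch. It relies only on the fact that bit 10 (0x0400) of the PCI command register is the "Interrupt Disable" bit, so clearing it enables legacy INTx delivery. The macro and function names are hypothetical and are not taken from the FreeBSD or Linux sources.

```c
#include <stdint.h>

/* Bit 10 of the PCI command register: "Interrupt Disable" (INTx disable). */
#define PCI_CMD_INTX_DISABLE 0x0400u

/* Return the command register value with legacy INTx delivery enabled. */
static uint16_t
pci_cmd_enable_intx(uint16_t cmd)
{
	return (uint16_t)(cmd & ~PCI_CMD_INTX_DISABLE);
}

/* Hypothetical helper: nonzero if legacy INTx delivery is enabled. */
static int
pci_cmd_intx_enabled(uint16_t cmd)
{
	return (cmd & PCI_CMD_INTX_DISABLE) == 0;
}
```

In a real driver the value would be read and written back with pci_read_config() and pci_write_config(), as in the snippet quoted above; the sketch only isolates the bit arithmetic.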

--
WBR, Andrey V. Elsukov



Re: [ATA] and re(4) stability issues

2008-12-10 Thread Victor Balada Diaz
On Wed, Dec 10, 2008 at 03:12:26PM +0900, Pyun YongHyeon wrote:
 On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
   Hello,
   
   I got various machines[1] at hetzner.de and I've been having problems
   with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
   been trying to narrow the problem so someone more knowledgeable than me
   is able to fix it. This mail is an other attempt to ask a question
   with regards ATA code to see if this time i got something.
   
   For the ones that don't actually know what happened:
   
   With FreeBSD 7.0 -RELEASE for amd64 and default kernel
   the system shared re0 interrupt with OHCI and this caused
   re(4) to corrupt packets and create interrupt storms. Tried
 
 re(4) in 7.0-RELEASE had a bus_dma(9) bug which could be easily
 triggered on systems with > 4GB memory. But I don't know whether
 this is related to the interrupt storms.
 
   updating to 7.1 -BETA2 and still had some problems with it.
   
   I've opened the PR kern/128287[2] and Remko quickly answered
   with a workaround: that workaround was removing USB support from
   my kernel. I did it and re(4) wasn't sharing interrupts anylonger,
   and the interrupt storms were gone. Now sometime later the interface
   goes up and down from time to time, but less often. Also sometimes
   the machine losts the network interface but continues to work.
   
 
 It seems that your controller supports MSI so you can set a tunable
 hw.re.msi_disable to 0 to enable MSI. With MSI you can remove
 interrupt sharing(e.g. add hw.re.msi_disable=0 to
 /boot/loader.conf file.) However there were several issues on re(4)
 w.r.t MSI so it was off by default.

This is undocumented and I can't find the tunable with sysctl -a. Is this
a HEAD feature or is it also in 7.1 -BETA2? Should I add
hw.re_msi_disable=0 to /boot/loader.conf?

The interrupt was being shared with USB. Does USB need any special MSI
handling, or is re using MSI enough to avoid sharing the interrupt?


 
   I know it continues to work because some days later i can see that
   it tried to deliver the status reports but was unable to resolve the
   aliases hostnames. I can't ping the machine and i know the network
   is OK. If i reboot the machine everything is working again.
   
 
 Recently I've made small changes to re(4) which may help to detect
 link state change event. Would you try re(4) in HEAD?

Can I just drop HEAD's /stable/7/sys/dev/re/ in -STABLE and test that,
or do I need to test the whole HEAD kernel?

Regards.

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: [ATA] and re(4) stability issues

2008-12-10 Thread Victor Balada Diaz
On Wed, Dec 10, 2008 at 11:58:12AM +0300, Andrey V. Elsukov wrote:
 Victor Balada Diaz wrote:
 Digging at linux source code i've found that they do some special things
 for this chipset that i've been unable to find on our code. This is
 linux code for my chipset:
 
 371 AHCI_HFLAGS (AHCI_HFLAG_IGN_SERR_INTERNAL |
 372  AHCI_HFLAG_32BIT_ONLY | AHCI_HFLAG_NO_MSI |
 373  AHCI_HFLAG_SECT255),
 
 File and the rest of the code in here[3].
 
 As i saw AHCI_HFLAG_NO_MSI i tried doing the easiest thing i could
 think of, switching MSI and MSI-x off for the whole system, so
 i added to /boot/loader.conf this tunables:
 
 FreeBSD's ata(4) driver doesn't support MSI. This flag is used in Linux's
 libata like this:
 
 if ((hpriv->flags & AHCI_HFLAG_NO_MSI) || pci_enable_msi(pdev))
         pci_intx(pdev, 1);
 
 In FreeBSD's code we have the same:
 
 /* enable PCI interrupt */
 pci_write_config(dev, PCIR_COMMAND,
                  pci_read_config(dev, PCIR_COMMAND, 2) & ~0x0400, 2);
 
 The AHCI_HFLAG_IGN_SERR_INTERNAL flag is meant to ignore SERR_INTERNAL errors.
 FreeBSD's ata(4) driver ignores them too.
 
 The AHCI_HFLAG_32BIT_ONLY flag restricts the driver to 32-bit DMA.
 In FreeBSD, if the AHCI CAP register reports that the controller supports
 64-bit DMA, the driver will use 64-bit DMA.
 So I think a quirk could be added for your chipset, but I'm not sure that the
 problem is here.
 
 The AHCI_HFLAG_SECT255 flag limits I/O operations to 255 sectors; FreeBSD uses
 a limit of 128 by default.

Thanks for explaining what the flags do. I'm not skilled enough to create
the DMA quirks, but if you could give me some patches I'll test them. Also,
if you have any other ideas on what I could test or how I can debug this,
they would be more than welcome.

Thanks.
Regards.

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: [ATA] and re(4) stability issues

2008-12-10 Thread Søren Schmidt

On 10Dec, 2008, at 10:11 , Victor Balada Diaz wrote:


Thanks for explaining what the flags do. I'm not skilled enough to create
the DMA quirks, but if you could give me some patches I'll test them. Also,
if you have any other ideas on what I could test or how I can debug this,
they would be more than welcome.



Comment out the following two lines in ata_ahci_dmainit():

if (ATA_INL(ctlr->r_res2, ATA_AHCI_CAP) & ATA_AHCI_CAP_64BIT)
        ch->dma->max_address = BUS_SPACE_MAXADDR;

And you will not use 64-bit DMA even if the chipset supports it.
However, I have not seen any chipset that supports this fail, YMMV as
usual :)
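
The effect of commenting out those two lines can be sketched in a small stand-alone C function. This is only an illustration under stated assumptions: it assumes the driver's default DMA limit is the 32-bit maximum address, and it uses bit 31 of the AHCI CAP register (the 64-bit addressing capability bit). The macro names, the function name and the force_32bit parameter are hypothetical and model the kind of quirk discussed in this thread, not existing ata(4) code.

```c
#include <stdint.h>

/* AHCI CAP bit 31 (S64A): controller supports 64-bit addressing. */
#define AHCI_CAP_64BIT    0x80000000u

#define DMA_MAXADDR_32BIT 0xffffffffULL
#define DMA_MAXADDR_64BIT 0xffffffffffffffffULL

/*
 * Pick the highest address the controller may DMA to.
 * force_32bit models a hypothetical quirk for chipsets that advertise
 * 64-bit DMA but misbehave when it is used.
 */
static uint64_t
ahci_dma_max_address(uint32_t cap, int force_32bit)
{
	if (!force_32bit && (cap & AHCI_CAP_64BIT) != 0)
		return DMA_MAXADDR_64BIT;
	return DMA_MAXADDR_32BIT;
}
```

Calling ahci_dma_max_address(cap, 1) corresponds to the suggestion above: the 64-bit capability is ignored and the 32-bit limit stays in place.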


-Søren








Re: [ATA] and re(4) stability issues

2008-12-10 Thread Victor Balada Diaz
On Wed, Dec 10, 2008 at 07:28:00PM +0900, Pyun YongHyeon wrote:
 On Wed, Dec 10, 2008 at 09:59:35AM +0100, Victor Balada Diaz wrote:
   On Wed, Dec 10, 2008 at 03:12:26PM +0900, Pyun YongHyeon wrote:
On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
  Hello,
  
  I got various machines[1] at hetzner.de and I've been having problems
  with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
  been trying to narrow the problem so someone more knowledgeable than me
  is able to fix it. This mail is an other attempt to ask a question
  with regards ATA code to see if this time i got something.
  
  For the ones that don't actually know what happened:
  
  With FreeBSD 7.0 -RELEASE for amd64 and default kernel
  the system shared re0 interrupt with OHCI and this caused
  re(4) to corrupt packets and create interrupt storms. Tried

re(4) in 7.0-RELEASE had a bus_dma(9) bug which could be easily
triggered on systems with > 4GB memory. But I don't know whether
this is related to the interrupt storms.

  updating to 7.1 -BETA2 and still had some problems with it.
  
  I've opened the PR kern/128287[2] and Remko quickly answered
  with a workaround: that workaround was removing USB support from
  my kernel. I did it and re(4) wasn't sharing interrupts anylonger,
  and the interrupt storms were gone. Now sometime later the interface
  goes up and down from time to time, but less often. Also sometimes
  the machine losts the network interface but continues to work.
  

It seems that your controller supports MSI so you can set a tunable
hw.re.msi_disable to 0 to enable MSI. With MSI you can remove
interrupt sharing(e.g. add hw.re.msi_disable=0 to
/boot/loader.conf file.) However there were several issues on re(4)
w.r.t MSI so it was off by default.
   
   This is undocumented and with sysctl -a i can't find the tunable. Is this
   a HEAD feature or it's also in 7.1 -BETA2? Should i add
 
 Yeah, it's an undocumented feature. But most drivers written by me
 have similar knobs. Both HEAD and stable/7 including 7.1 BETA2 have
 the tunable.

I think it would be great if you could document it or at least
show it by default when you do sysctl -ad with a small description.

 
   hw.re_msi_disable=0 to /boot/loader.conf?
^
Should be hw.re.msi_disable=0
   
 
 Yes, just add it to /boot/loader.conf. Note that you must not disable
 the system-wide MSI control (i.e. hw.pci.enable_msi must remain 1).
 
   This was sharing interrupt with USB, does USB need any special MSI handling
   or with re using MSI is enough to not share the interrupt?
 
 If re(4) can use MSI, you don't need to worry about interrupt
 sharing with USB. Check the output of vmstat -i. You normally get
 an irq256 or higher for MSI enabled driver.
 
   
   

  I know it continues to work because some days later i can see that
  it tried to deliver the status reports but was unable to resolve the
  aliases hostnames. I can't ping the machine and i know the network
  is OK. If i reboot the machine everything is working again.
  

Recently I've made small changes to re(4) which may help to detect
link state change event. Would you try re(4) in HEAD?
   
   Can i just drop HEAD's /stable/7/sys/dev/re/ in -STABLE and test that
 
 Yes, you can. It should build without problems. Just replace re(4) on
 stable/7 with HEAD version.
 
   or do i need to test the whole HEAD kernel?
   
 
 No, you don't have to do that.

While backporting the changes I found that it didn't compile, so in
the end I took the following files from HEAD:

base/head/sys/dev/re/if_re.c
base/head/sys/pci/if_rl.c
base/head/sys/pci/if_rlreg.h

After that I recompiled the 7.1 -BETA2 GENERIC kernel and enabled
the knob you suggested in /boot/loader.conf.

With the new kernel and MSI the interrupts are like this:

# vmstat -i
interrupt                          total       rate
irq9: acpi0                            1          0
irq16: ohci0                           1          0
irq17: ohci1 ohci3                     1          0
irq18: ohci2 ohci4                     1          0
irq22: atapci0                     19215         15
cpu0: timer                      2502718       1998
irq256: re0                      4967726       3967
cpu1: timer                      2502525       1998
Total                            9992188       7980

The high interrupt numbers are because I've been running iperf to
check that everything is fine, not because of interrupt storms. So far
I haven't seen any interrupt storms related to USB or the re(4) driver,
but while doing the tests I saw this error:

re0: watchdog timeout (missed Tx interrupts) -- recovering

This didn't create any errors on the interfaces (netstat -i).

Also i didn't see any problem with interfaces going up and down,
but that usually happen after some hours 

Re: [ATA] and re(4) stability issues

2008-12-10 Thread Pyun YongHyeon
On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
  On Wed, Dec 10, 2008 at 07:28:00PM +0900, Pyun YongHyeon wrote:
   On Wed, Dec 10, 2008 at 09:59:35AM +0100, Victor Balada Diaz wrote:
 On Wed, Dec 10, 2008 at 03:12:26PM +0900, Pyun YongHyeon wrote:
  On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
Hello,

I got various machines[1] at hetzner.de and I've been having problems
with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
been trying to narrow the problem so someone more knowledgeable than me
is able to fix it. This mail is an other attempt to ask a question
with regards ATA code to see if this time i got something.

For the ones that don't actually know what happened:

With FreeBSD 7.0 -RELEASE for amd64 and default kernel
the system shared re0 interrupt with OHCI and this caused
re(4) to corrupt packets and create interrupt storms. Tried
  
  re(4) in 7.0-RELEASE had a bus_dma(9) bug which could be easily
  triggered on systems with > 4GB memory. But I don't know whether
  this is related to the interrupt storms.
  
updating to 7.1 -BETA2 and still had some problems with it.

I've opened the PR kern/128287[2] and Remko quickly answered
with a workaround: that workaround was removing USB support from
my kernel. I did it and re(4) wasn't sharing interrupts anylonger,
and the interrupt storms were gone. Now sometime later the 
   interface
goes up and down from time to time, but less often. Also sometimes
the machine losts the network interface but continues to work.

  
  It seems that your controller supports MSI so you can set a tunable
  hw.re.msi_disable to 0 to enable MSI. With MSI you can remove
  interrupt sharing(e.g. add hw.re.msi_disable=0 to
  /boot/loader.conf file.) However there were several issues on re(4)
  w.r.t MSI so it was off by default.
 
 This is undocumented and with sysctl -a i can't find the tunable. Is 
   this
 a HEAD feature or it's also in 7.1 -BETA2? Should i add
   
   Yeah, it's an undocumented feature. But most drivers written by me
   have similar knobs. Both HEAD and stable/7 including 7.1 BETA2 have
   the tunable.
  
  I think it could be great if you could document it or at least
  show it by default when you do sysctl -ad with a small description.
  

If MSI worked as expected I would have documented it as I did
in msk(4)/nfe(4)/ale(4)/age(4)/jme(4) etc.
Using MSI on RealTek does not seem to be stable. I tried hard to fix
that but some users still reported watchdog timeouts. Working
without documentation and hardware also made it hard to complete
the work. This was the main reason why MSI was disabled on re(4).

   
 hw.re_msi_disable=0 to /boot/loader.conf?
  ^
  Should be hw.re.msi_disable=0
 
   
   Yes, just add it to /boot/loader.conf. Note that you must not disable
   the system-wide MSI control (i.e. hw.pci.enable_msi must remain 1).
   
 This was sharing interrupt with USB, does USB need any special MSI 
   handling
 or with re using MSI is enough to not share the interrupt?
   
   If re(4) can use MSI, you don't need to worry about interrupt
   sharing with USB. Check the output of vmstat -i. You normally get
   an irq256 or higher for MSI enabled driver.
   
 
 
  
I know it continues to work because some days later i can see that
it tried to deliver the status reports but was unable to resolve 
   the
aliases hostnames. I can't ping the machine and i know the network
is OK. If i reboot the machine everything is working again.

  
  Recently I've made small changes to re(4) which may help to detect
  link state change event. Would you try re(4) in HEAD?
 
 Can i just drop HEAD's /stable/7/sys/dev/re/ in -STABLE and test that
   
   Yes, you can. It should build without problems. Just replace re(4) on
   stable/7 with HEAD version.
   
 or do i need to test the whole HEAD kernel?
 
   
   No, you don't have to do that.
  
  Backporting the changes i've found that it didn't compile so in
  the end i got from HEAD the following files:
  
  base/head/sys/dev/re/if_re.c
  base/head/sys/pci/if_rl.c
  base/head/sys/pci/if_rlreg.h
  

Ah, sorry about that. There were some changes recently; I forgot
about that.

  After that i've recompiled 7.1 -BETA2 GENERIC kernel and enabled
  the knob you suggested in /boot/loader.conf.
  
  With the new kernel and MSI the interrupts are like this:
  
  # vmstat -i
  interrupt                          total       rate
  irq9: acpi0                            1          0
  irq16: ohci0                           1          0
  irq17: ohci1 ohci3                     1          0
  irq18: ohci2 ohci4                     1          0
  

Re: [ATA] and re(4) stability issues

2008-12-10 Thread Arnaud Houdelette

Victor Balada Diaz wrote:

Hello,

I got various machines[1] at hetzner.de and I've been having problems
with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
been trying to narrow the problem so someone more knowledgeable than me
is able to fix it. This mail is an other attempt to ask a question
with regards ATA code to see if this time i got something.

For the ones that don't actually know what happened:

With FreeBSD 7.0 -RELEASE for amd64 and default kernel
the system shared re0 interrupt with OHCI and this caused
re(4) to corrupt packets and create interrupt storms. Tried
updating to 7.1 -BETA2 and still had some problems with it.

I've opened the PR kern/128287[2] and Remko quickly answered
with a workaround: that workaround was removing USB support from
my kernel. I did it and re(4) wasn't sharing interrupts any longer,
and the interrupt storms were gone. Now, some time later, the interface
goes up and down from time to time, but less often. Also sometimes
the machine loses the network interface but continues to work.

I know it continues to work because some days later i can see that
it tried to deliver the status reports but was unable to resolve the
aliases hostnames. I can't ping the machine and i know the network
is OK. If i reboot the machine everything is working again.

When switched from 7.0 to 7.1 BETA2 i also found that under load
after some hours the machine created interrupt storms on ATA disks.

Digging at linux source code i've found that they do some special things
for this chipset that i've been unable to find on our code. This is
linux code for my chipset:

371 AHCI_HFLAGS (AHCI_HFLAG_IGN_SERR_INTERNAL |
372  AHCI_HFLAG_32BIT_ONLY | AHCI_HFLAG_NO_MSI |
373  AHCI_HFLAG_SECT255),

File and the rest of the code in here[3].

As i saw AHCI_HFLAG_NO_MSI i tried doing the easiest thing i could
think of, switching MSI and MSI-x off for the whole system, so
i added to /boot/loader.conf this tunables:

hw.pci.enable_msix=0
hw.pci.enable_msi=0

And then rebooted the machine. After various hours of doing almost nothing
i've found that the machine answered ping but was unable to answer any
request (eg, ssh, nagios nrpe, etc). The machine recovered itself after
some minutes and when i was able to ssh into i saw the following in dmesg:

ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=1463123158

and a lot more errors like that. I didn't get these errors with MSI enabled.
I see WRITE_DMA48 and in the linux code I saw AHCI_HFLAG_32BIT_ONLY, which is
later used for DMA related things. Could someone who is more knowledgeable check
if we're doing the right thing?

I've attached a verbose dmesg of a machine that's like this one with
7.1 -BETA2, MSI enabled and a GENERIC kernel minus USB and firewire.

Also, please, could someone give me a hand on how I could continue debugging
these interrupt issues? I'm a bit lost, and digging through code and posting
each time I think I've found something is not going to go anywhere.

I would also like to say that I've seen reports of this kind of problem
on amd64 machines on the lists for several years, so I don't think
this is just a problem with this BIOS/motherboard (MSI K9AG Neo2 Digital).


Thanks in advance for any help.
Regards.


[1]: http://www.hetzner.de/hosting/produkte_rootserver/ds7000/
[2]: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/128287
[3]: http://fxr.watson.org/fxr/source/drivers/ata/ahci.c?v=linux-2.6#L369
  


Sorry I didn't take the time to read the whole thread, but I had a similar
problem with the same IXP600 chipset.
Only it wasn't with a Realtek NIC (re) but with a Ralink wireless one.
The symptoms were similar: interrupt 22 was shared between the SATA
controller and the wireless card, and I got interrupt storms at random
times when using the wireless network.


No problem since I removed the ral(4) NIC (I have a real access point now).
You might not want to point the finger at the re(4) driver too fast.

Arnaud Houdelette




Re: [ATA] and re(4) stability issues

2008-12-10 Thread Pyun YongHyeon
On Wed, Dec 10, 2008 at 09:59:35AM +0100, Victor Balada Diaz wrote:
  On Wed, Dec 10, 2008 at 03:12:26PM +0900, Pyun YongHyeon wrote:
   On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
 Hello,
 
 I got various machines[1] at hetzner.de and I've been having problems
 with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
 been trying to narrow the problem so someone more knowledgeable than me
 is able to fix it. This mail is an other attempt to ask a question
 with regards ATA code to see if this time i got something.
 
 For the ones that don't actually know what happened:
 
 With FreeBSD 7.0 -RELEASE for amd64 and default kernel
 the system shared re0 interrupt with OHCI and this caused
 re(4) to corrupt packets and create interrupt storms. Tried
   
   re(4) in 7.0-RELEASE had a bus_dma(9) bug which could be easily
   triggered on systems with > 4GB memory. But I don't know whether
   this is related to the interrupt storms.
   
 updating to 7.1 -BETA2 and still had some problems with it.
 
 I've opened the PR kern/128287[2] and Remko quickly answered
 with a workaround: that workaround was removing USB support from
 my kernel. I did it and re(4) wasn't sharing interrupts anylonger,
 and the interrupt storms were gone. Now sometime later the interface
 goes up and down from time to time, but less often. Also sometimes
 the machine losts the network interface but continues to work.
 
   
   It seems that your controller supports MSI so you can set a tunable
   hw.re.msi_disable to 0 to enable MSI. With MSI you can remove
   interrupt sharing(e.g. add hw.re.msi_disable=0 to
   /boot/loader.conf file.) However there were several issues on re(4)
   w.r.t MSI so it was off by default.
  
  This is undocumented and with sysctl -a i can't find the tunable. Is this
  a HEAD feature or it's also in 7.1 -BETA2? Should i add

Yeah, it's an undocumented feature. But most drivers written by me
have similar knobs. Both HEAD and stable/7 including 7.1 BETA2 have
the tunable.

  hw.re_msi_disable=0 to /boot/loader.conf?
   ^
   Should be hw.re.msi_disable=0
  

Yes, just add it to /boot/loader.conf. Note that you must not disable
the system-wide MSI control (i.e. hw.pci.enable_msi must remain 1).

  This was sharing interrupt with USB, does USB need any special MSI handling
  or with re using MSI is enough to not share the interrupt?

If re(4) can use MSI, you don't need to worry about interrupt
sharing with USB. Check the output of vmstat -i. You normally get
an irq256 or higher for MSI enabled driver.
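
The vmstat -i check described above could be automated along the following lines. This is a minimal C sketch, assuming lines in the "irqN: device ..." form shown elsewhere in this thread and using the rule of thumb that MSI vectors appear as irq256 or higher. Both function names are hypothetical.

```c
#include <stdio.h>

/*
 * Parse the IRQ number from a vmstat -i line such as "irq256: re0 ...".
 * Returns -1 for lines that do not start with "irq<N>:" (e.g. "cpu0: timer").
 */
static int
vmstat_irq_number(const char *line)
{
	int irq;
	char colon;

	if (sscanf(line, " irq%d%c", &irq, &colon) == 2 &&
	    colon == ':' && irq >= 0)
		return irq;
	return -1;
}

/* Rule of thumb from this thread: MSI vectors show up as irq256 or higher. */
static int
vmstat_line_is_msi(const char *line)
{
	return vmstat_irq_number(line) >= 256;
}
```

Feeding each line of vmstat -i output to vmstat_line_is_msi() and looking for the re0 entry would show whether the driver really got its own MSI vector.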

  
  
   
 I know it continues to work because some days later i can see that
 it tried to deliver the status reports but was unable to resolve the
 aliases hostnames. I can't ping the machine and i know the network
 is OK. If i reboot the machine everything is working again.
 
   
   Recently I've made small changes to re(4) which may help to detect
   link state change event. Would you try re(4) in HEAD?
  
  Can i just drop HEAD's /stable/7/sys/dev/re/ in -STABLE and test that

Yes, you can. It should build without problems. Just replace re(4) on
stable/7 with HEAD version.

  or do i need to test the whole HEAD kernel?
  

No, you don't have to do that.

-- 
Regards,
Pyun YongHyeon


Re: [ATA] and re(4) stability issues

2008-12-10 Thread Oliver Peter
On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
 Hello,
 
 I got various machines[1] at hetzner.de and I've been having problems
 with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
 been trying to narrow the problem so someone more knowledgeable than me
 is able to fix it. This mail is an other attempt to ask a question
 with regards ATA code to see if this time i got something.

Just want to add a quick note and say that I'm having the same problem
with my 7.0-RELEASE-p6/amd64 hetzner machine:

 http://lists.freebsd.org/pipermail/freebsd-acpi/2008-September/005095.html

I would be happy to test patches as well.  Thanks.

-- 
Oliver PETER, email: [EMAIL PROTECTED], ICQ# 113969174
If it feels good, you're doing something wrong.
  -- Coach McTavish


Re: [ATA] and re(4) stability issues

2008-12-10 Thread Victor Balada Diaz
On Wed, Dec 10, 2008 at 12:08:40PM +, Oliver Peter wrote:
 On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
  Hello,
  
  I got various machines[1] at hetzner.de and I've been having problems
  with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
  been trying to narrow the problem so someone more knowledgeable than me
  is able to fix it. This mail is an other attempt to ask a question
  with regards ATA code to see if this time i got something.
 
 Just want to add a quick note and say that I'm having the same problem
 with my 7.0-RELEASE-p6/amd64 hetzner machine:
 
  
 http://lists.freebsd.org/pipermail/freebsd-acpi/2008-September/005095.html
 
 I would be happy to test patches as well.  Thanks.

Hello Oliver,

What I have done so far, which improved things a lot, was:

1) Upgrade at least the if_re code to RELENG_7. This fixes issues
   of packet corruption on ssh sessions.

2) Remove USB and firewire from your kernel config. This prevents
   the realtek interrupt from being shared.

After this, with 7.1 -BETA2 the systems are more or less stable, but
after a while the ATA controller starts to create interrupt storms.
I wasn't able to find out why.

With the help that I've received in this thread from Pyun
YongHyeon (Thanks!!) I'm also trying these suggestions:

3) Backport these 3 files from current to 7.1 -BETA2:

base/head/sys/dev/re/if_re.c
base/head/sys/pci/if_rl.c
base/head/sys/pci/if_rlreg.h

You can fetch them from http://svn.freebsd.org/. With them, and
after adding this tunable to /boot/loader.conf:

hw.re.msi_disable=0

I can use the GENERIC kernel again (i.e. USB enabled) and so far
I haven't found any problems. No more interface up/down problems
and no more interrupt storms. I must say that I haven't tested
this enough, because the interrupt storms in the ATA code start to
happen after a few days of uptime under load, but at least the problems
with the realtek seem to be gone.

If you upgrade to 7.1 -BETA2 you'll also get SATA support for
the IXP card. With 7.0 it will work as ATA 33 in compatibility mode.

Maybe someone with write access to the wiki could add it somewhere
so that other hetzner users that are having the same problems
could use the same workarounds :)

I hope this helps you.

Regards.

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: [ATA] and re(4) stability issues

2008-12-10 Thread Gary Jennejohn
On Wed, 10 Dec 2008 21:07:19 +0900
Pyun YongHyeon [EMAIL PROTECTED] wrote:
 On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:

   As these seems to improve the current situation, is there any
   chance of merging -current driver in 7.1 before release?
  

 I think re(4) in HEAD needs more testing. As you might know RealTek
 produced too many chipsets. :-(


FYI I've now turned MSI on in HEAD and will see what happens.  Before
my re0 was sharing interrupts with 3 USB controllers.  Now it's all
by itself on irq256.

I'm running amd64 with

re0: RealTek 8168/8168B/8168C/8168CP/8168D/8111B/8111C/8111CP PCIe
Gigabit Ethernet port 0xde00-0xdeff mem 0xfdaff000-0xfdaf,
0xfdae-0xfdae irq 18 at device 0.0 on pci2

---
Gary Jennejohn


Re: [ATA] and re(4) stability issues

2008-12-10 Thread Victor Balada Diaz
On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:
 On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
   On Wed, Dec 10, 2008 at 07:28:00PM +0900, Pyun YongHyeon wrote:
On Wed, Dec 10, 2008 at 09:59:35AM +0100, Victor Balada Diaz wrote:
  On Wed, Dec 10, 2008 at 03:12:26PM +0900, Pyun YongHyeon wrote:
   On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
 Hello,
 
 I got various machines[1] at hetzner.de and I've been having problems
 with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
 been trying to narrow the problem so someone more knowledgeable than me
 is able to fix it. This mail is another attempt to ask a question
 with regard to the ATA code to see if this time i got something.
 
 For the ones that don't actually know what happened:
 
 With FreeBSD 7.0 -RELEASE for amd64 and default kernel
 the system shared re0 interrupt with OHCI and this caused
 re(4) to corrupt packets and create interrupt storms. Tried
   
   re(4) in 7.0-RELEASE had a bus_dma(9) bug which could be easily
   triggered on systems with > 4GB memory. But I don't know whether
   this is related to the interrupt storms.
   
 updating to 7.1 -BETA2 and still had some problems with it.
 
 I've opened the PR kern/128287[2] and Remko quickly answered
 with a workaround: that workaround was removing USB support from
 my kernel. I did it and re(4) wasn't sharing interrupts any longer,
 and the interrupt storms were gone. Now sometime later the interface
 goes up and down from time to time, but less often. Also sometimes
 the machine loses the network interface but continues to work.
 
   
   It seems that your controller supports MSI so you can set a tunable
   hw.re.msi_disable to 0 to enable MSI. With MSI you can remove
   interrupt sharing(e.g. add hw.re.msi_disable=0 to
   /boot/loader.conf file.) However there were several issues on re(4)
   w.r.t MSI so it was off by default.
  
  This is undocumented and with sysctl -a i can't find the tunable. Is this
  a HEAD feature or is it also in 7.1 -BETA2? Should i add

Yeah it's an undocumented feature. But most drivers written by me
have similar knobs. Both HEAD and stable/7 including 7.1 BETA2 have
the tunable.
   
   I think it could be great if you could document it or at least
   show it by default when you do sysctl -ad with a small description.
   
 
 If MSI worked as expected I would have documented it as I did
 in msk(4)/nfe(4)/ale(4)/age(4)/jme(4) etc.
 Using MSI on RealTek does not seem to be stable. I tried hard to fix
 that but some users still reported watchdog timeouts. Working
 without documentation and hardware also made it hard to complete
 the work. This was the main reason why MSI was disabled on re(4).

What do you think about adding a note in the man page telling that
it's experimental and in some cases it could improve the situation
but in others it will give errors? 

 

  hw.re_msi_disable=0 to /boot/loader.conf?
   ^
   Should be hw.re.msi_disable=0
  

Yes, just add it to /boot/loader.conf. Note, you should not disable
system-wide MSI control(e.g. hw.pci.enable_msi == 1).

  This was sharing interrupt with USB, does USB need any special MSI handling
  or is re using MSI enough to not share the interrupt?

If re(4) can use MSI, you don't need to worry about interrupt
sharing with USB. Check the output of vmstat -i. You normally get
an irq256 or higher for MSI enabled driver.
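
The vmstat -i check described above can be sketched in sh. This is only an
illustration: the function names irq_of and uses_msi are made up here, the
sample table is invented rather than taken from a real machine, and the 256
threshold is simply the rule of thumb quoted in this message.

```sh
#!/bin/sh
# Print the IRQ number a device (e.g. re0) is attached to, given
# `vmstat -i` output on standard input. Prints nothing if not found.
irq_of() {
    awk -v dev="$1" '
        /^irq[0-9]+:/ {
            # Fields: "irqN:", one or more device names, total, rate.
            for (i = 2; i <= NF - 2; i++)
                if ($i == dev) {
                    n = $1
                    gsub(/[^0-9]/, "", n)   # "irq256:" -> "256"
                    print n
                    exit
                }
        }'
}

# Succeed when the device sits on irq256 or higher, which this thread
# describes as the usual sign of an MSI-enabled driver.
uses_msi() {
    irq=$(irq_of "$1")
    [ -n "$irq" ] && [ "$irq" -ge 256 ]
}

# Invented sample input, for illustration only.
sample='interrupt                          total       rate
irq18: ohci0 ohci1                      1200          0
irq256: re0                           345678         40'

if printf '%s\n' "$sample" | uses_msi re0; then
    echo "re0: MSI"
else
    echo "re0: legacy IRQ"
fi
```

On a live system the input would come from the real command, for example
"vmstat -i | uses_msi re0".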

  
  
   
 I know it continues to work because some days later i can see that
 it tried to deliver the status reports but was unable to resolve the
 aliases hostnames. I can't ping the machine and i know the network
 is OK. If i reboot the machine everything is working again.
 
   
   Recently I've made small changes to re(4) which may help to detect
   link state change event. Would you try re(4) in HEAD?
  
  Can i just drop HEAD's /stable/7/sys/dev/re/ in -STABLE and test that

Yes, you can. It should build without problems. Just replace re(4) on
stable/7 with HEAD version.

  or do i need to test the whole HEAD kernel?
  

No you don't have to that.
   
   Backporting the changes i've found that it didn't compile so in
   the end i got from HEAD the following files:
   
   base/head/sys/dev/re/if_re.c
   base/head/sys/pci/if_rl.c
   base/head/sys/pci/if_rlreg.h
   
 
 Ah,, sorry about that. Recently there was some changes. I forgot
 that.
 
   After that i've recompiled 7.1 -BETA2 GENERIC kernel and enabled
   the knob you suggested in /boot/loader.conf.
   
   With 

Re: [ATA] and re(4) stability issues

2008-12-10 Thread Oliver Peter
On Wed, Dec 10, 2008 at 03:01:30PM +0100, Victor Balada Diaz wrote:
 On Wed, Dec 10, 2008 at 12:08:40PM +, Oliver Peter wrote:
  On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
...
 I can use GENERIC kernel again (ie, USB enabled) and so far
 i didn't find any problem yet. No more interface up/down problems
 and no more interrupt storms. I must say that i haven't tested
 this enough, because the interrupt storms in ATA code start to
 happen after a few days of uptime load, but at least the problems
 with the realtek seem to be gone. 

I found out that I'm able to 'force' the interrupt storm by provoking
higher disk I/O.  Just let dd write to a file in a loop for some hours
and watch vmstat:

while true; do dd if=/dev/zero of=BLA bs=1M count=1000; done

First you'll see that the throughput will decrease, and a few
hours later you'll have /var/log/messages / dmesg full of
interrupt storm messages.
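
To see how fast the storm messages accumulate while the dd loop runs, a small
counter can help. This is a minimal sketch: count_storms is a made-up helper
name, and the sample lines are invented for the example (their exact wording
is an assumption, not copied from a real log).

```sh
#!/bin/sh
# Count log lines that mention an interrupt storm.
# Reads the file given as $1, or standard input when no file is given.
count_storms() {
    grep -ci 'interrupt storm' "${1:--}"
}

# Invented sample input, for illustration only.
sample='Dec 10 01:00:00 host kernel: interrupt storm detected on "irq22:"; throttling interrupt source
Dec 10 01:00:01 host kernel: re0: link state changed to UP
Dec 10 01:00:05 host kernel: interrupt storm detected on "irq22:"; throttling interrupt source'

printf '%s\n' "$sample" | count_storms   # prints 2
```

On the affected machine this would be run as "count_storms /var/log/messages"
or "dmesg | count_storms".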
 
 If you upgrade to 7.1 -BETA2 you'll also get SATA support for
 the IXP card. With 7.0 it will work as ATA 33 in compatibility mode.

Wow!  That's good to hear as well.  I'll definitely switch to
-STABLE or 7.1-PRERELEASE sooner or later.  I'll just give it a try
on my other machines at first.
 
 I hope this helps you.

Absolutely, cheers mate.  I owe you one!

~ollie

-- 
Oliver PETER, email: [EMAIL PROTECTED], ICQ# 113969174
If it feels good, you're doing something wrong.
  -- Coach McTavish


Re: [ATA] and re(4) stability issues

2008-12-10 Thread Victor Balada Diaz
On Wed, Dec 10, 2008 at 01:18:00PM +0100, Arnaud Houdelette wrote:
 Victor Balada Diaz a écrit :
 Hello,
 
 I got various machines[1] at hetzner.de and I've been having problems
 with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
 been trying to narrow the problem so someone more knowledgeable than me
 is able to fix it. This mail is another attempt to ask a question
 with regard to the ATA code to see if this time i got something.
 
 [...] 
 
 Sorry I didn't take the time to read all the thread, but I got similar 
 problem with the same IXP600 chipset.
 Only it wasn't with a Realtek NIC (re) but with a Ralink wireless one.
 The symptoms were similar: interrupt 22 was shared between the SATA
 controller and the wireless card. And I got Interrupt Storms at random
 times when using the wireless network.
 
 No problem since I removed the ral(4) NIC (got a real access point now).
 You might not want to point the finger at the re(4) driver too fast.
 
 Arnaud Houdelette
Hello Arnaud,

I didn't say the problem was just because of re(4). Actually i think
there were two problems, one with re(4) and another with ata(4). The reason
why i talked about both of them in the same mail is because i thought
that as two drivers were affected, maybe the problem was in other part
of the operating system and that could help the developers to debug the
problem.

My re(4) card isn't sharing the interrupt with IXP600, it's sharing
the interrupt with USB controller. In this case i think the problem
is fixed with the advice from Pyun YongHyeon (backporting the driver
from HEAD and using MSI for interrupts).

I think the problems with ata(4) code will appear again after a few
days of load, as they always do, so i'll keep trying to debug them.

Regards.

-- 
La prueba más fehaciente de que existe vida inteligente en otros
planetas, es que no han intentado contactar con nosotros. 


Re: [ATA] and re(4) stability issues

2008-12-10 Thread Peter Jeremy
On 2008-Dec-10 10:55:35 +0100, Søren Schmidt [EMAIL PROTECTED] wrote:
And you will not use 64bit DMA even if the chipset supports it.  
However I have not seen any chipsets supporting this fail, YMMV as  
usual :)

There's a reference in wikipedia pointing to
http://www.mail-archive.com/[EMAIL PROTECTED]/msg06694.html
that claims the AMD/ATI SB600 lies about supporting 64-bit DMA in AHCI
mode.  I have a SB600 but it doesn't have 4GB to test on.

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.




Re: [ATA] and re(4) stability issues

2008-12-10 Thread Pyun YongHyeon
On Wed, Dec 10, 2008 at 03:08:24PM +0100, Victor Balada Diaz wrote:
  On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:

[...]

 It seems that your controller supports MSI so you can set a 
   tunable
 hw.re.msi_disable to 0 to enable MSI. With MSI you can remove
 interrupt sharing(e.g. add hw.re.msi_disable=0 to
 /boot/loader.conf file.) However there were several issues on 
   re(4)
 w.r.t MSI so it was off by default.

This is undocumented and with sysctl -a i can't find the tunable. 
   Is this
a HEAD feature or it's also in 7.1 -BETA2? Should i add
  
  Yeah it's an undocumented feature. But most drivers written by me
  have similar knobs. Both HEAD and stable/7 including 7.1 BETA2 have
  the tunable.
 
 I think it could be great if you could document it or at least
 show it by default when you do sysctl -ad with a small description.
 
   
   If MSI worked as expected I would have documented it as I did
   in msk(4)/nfe(4)/ale(4)/age(4)/jme(4) etc.
   Using MSI on RealTek does not seem to be stable. I tried hard to fix
   that but some users still reported watchdog timeouts. Working
   without documentation and hardware also made it hard to complete
   the work. This was the main reason why MSI was disabled on re(4).
  
  What do you think about adding a note in the man page telling that
  it's experimental and in some cases it could improve the situation
  but in others it will give errors? 

Based on your testing I have an idea how to mitigate the missing
Tx completion interrupt. If all goes well re(4) could reliably take
advantage of MSI on RealTek controllers. If that fails miserably I
will do as you suggested.

   
   I think re(4) in HEAD needs more testing. As you might know RealTek
   produced too many chipsets. :-(
  
  Ok, i'll use the backported driver as it works better for me :-)
  
  If i can help you testing any patches i'm more than welcome to do it.
  
  Thanks a lot for your help Pyun YongHyeon.
  

You're welcome.
-- 
Regards,
Pyun YongHyeon


Re: [ATA] and re(4) stability issues

2008-12-10 Thread Victor Balada Diaz
On Wed, Dec 10, 2008 at 09:07:19PM +0900, Pyun YongHyeon wrote:
 On Wed, Dec 10, 2008 at 12:32:25PM +0100, Victor Balada Diaz wrote:
   Also i didn't see any problem with interfaces going up and down,
   but that usually happen after some hours of uptime, so i'll let
   you know if the error happens again.
   

After writing to the HD with dd for a few hours and using
stress -i 10 -d 10 the machine lost connectivity. I waited until
today to be sure if the machine hung, panicked or just lost network
connectivity. I don't have local access or serial access, so this
is the only way i could do it. I've seen in the logs during the
night various messages of:


Dec 10 00:33:49 yac kernel: re0: watchdog timeout
Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN
Dec 10 00:33:52 yac kernel: re0: link state changed to UP

The interface never recovered and i wasn't able to ping the machine
until i rebooted. Nagios was checking all the time and no recovery
happened.
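
For comparing runs, the three kinds of re0 messages shown above can be tallied
from a log. A minimal sketch, assuming the message wording is exactly as in
the lines quoted in this mail; re_events is a made-up helper name.

```sh
#!/bin/sh
# Tally watchdog timeouts and link state changes for one interface.
# Usage: re_events IFNAME < logfile
re_events() {
    awk -v ifn="$1" '
        index($0, ifn ": watchdog timeout")           { wd++ }
        index($0, ifn ": link state changed to DOWN") { down++ }
        index($0, ifn ": link state changed to UP")   { up++ }
        END { printf "watchdog=%d down=%d up=%d\n", wd, down, up }'
}

# The three lines reported in this mail, used as sample input.
sample='Dec 10 00:33:49 yac kernel: re0: watchdog timeout
Dec 10 00:33:49 yac kernel: re0: link state changed to DOWN
Dec 10 00:33:52 yac kernel: re0: link state changed to UP'

printf '%s\n' "$sample" | re_events re0   # prints watchdog=1 down=1 up=1
```

Against the real log this would be "re_events re0 < /var/log/messages". Equal
down and up counts with no connectivity would match what is described here:
the link comes back but the interface does not recover.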

The netstat -i in daily scripts shows just one Oerrs. I'm used to
have a lot of them, but seems this time the card didn't recover from
the only one. I also want to say that this is not a regression, as
it happened before with 7.1 -BETA 2 code.

Is there anything more i can try?

Regards.
-- 
La prueba más fehaciente de que existe vida inteligente en otros
planetas, es que no han intentado contactar con nosotros. 


[ATA] and re(4) stability issues

2008-12-09 Thread Victor Balada Diaz
Hello,

I got various machines[1] at hetzner.de and I've been having problems
with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
been trying to narrow the problem so someone more knowledgeable than me
is able to fix it. This mail is another attempt to ask a question
with regard to the ATA code to see if this time i got something.

For the ones that don't actually know what happened:

With FreeBSD 7.0 -RELEASE for amd64 and default kernel
the system shared re0 interrupt with OHCI and this caused
re(4) to corrupt packets and create interrupt storms. Tried
updating to 7.1 -BETA2 and still had some problems with it.

I've opened the PR kern/128287[2] and Remko quickly answered
with a workaround: that workaround was removing USB support from
my kernel. I did it and re(4) wasn't sharing interrupts any longer,
and the interrupt storms were gone. Now sometime later the interface
goes up and down from time to time, but less often. Also sometimes
the machine loses the network interface but continues to work.

I know it continues to work because some days later i can see that
it tried to deliver the status reports but was unable to resolve the
aliases hostnames. I can't ping the machine and i know the network
is OK. If i reboot the machine everything is working again.

When switched from 7.0 to 7.1 BETA2 i also found that under load
after some hours the machine created interrupt storms on ATA disks.

Digging through the Linux source code i've found that it does some special
things for this chipset that i've been unable to find in our code. This is
the Linux code for my chipset:

AHCI_HFLAGS (AHCI_HFLAG_IGN_SERR_INTERNAL |
             AHCI_HFLAG_32BIT_ONLY | AHCI_HFLAG_NO_MSI |
             AHCI_HFLAG_SECT255),

File and the rest of the code in here[3].

As i saw AHCI_HFLAG_NO_MSI i tried doing the easiest thing i could
think of, switching MSI and MSI-x off for the whole system, so
i added to /boot/loader.conf these tunables:

hw.pci.enable_msix=0
hw.pci.enable_msi=0

And then rebooted the machine. After various hours of doing almost nothing
i've found that the machine answered ping but was unable to answer any
request (eg, ssh, nagios nrpe, etc). The machine recovered itself after
some minutes and when i was able to ssh into i saw the following in dmesg:

ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=1463123158

and a lot more errors like that. I didn't get these errors with MSI enabled.
I see WRITE_DMA48 and in linux code i saw AHCI_HFLAG_32BIT_ONLY which is later
used for DMA related things. Could someone who is more knowledgeable check
if we're doing the right thing?

I've attached verbose dmesg of a machine that's like this one with
7.1 -BETA2, MSI enabled and GENERIC kernel minus USB and firewire.

Also, please, could someone give me a hand on how i could continue debugging
these interrupt issues? I'm a bit lost, and digging through code and posting
each time i think i've found something is not going to get anywhere.

I would also like to say that i've seen reports of this kind of problem
on amd64 machines in the lists for several years, so i don't think
this is just a problem with this BIOS/motherboard (MSI K9AG Neo2 Digital).


Thanks in advance for any help.
Regards.


[1]: http://www.hetzner.de/hosting/produkte_rootserver/ds7000/
[2]: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/128287
[3]: http://fxr.watson.org/fxr/source/drivers/ata/ahci.c?v=linux-2.6#L369
-- 
La prueba más fehaciente de que existe vida inteligente en otros
planetas, es que no han intentado contactar con nosotros. 
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-BETA2 #1: Wed Oct 22 13:19:14 CEST 2008
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/NOUSB
Preloaded elf kernel /boot/kernel/kernel at 0x80c12000.
Preloaded elf obj module /boot/kernel/geom_mirror.ko at 0x80c121a8.
Preloaded elf obj module /boot/kernel/accf_data.ko at 0x80c12818.
Preloaded elf obj module /boot/kernel/accf_http.ko at 0x80c12cc8.
Preloaded elf obj module /boot/kernel/k8temp.ko at 0x80c13238.
Preloaded elf obj module /boot/kernel/geom_journal.ko at 0x80c13720.
Calibrating clock(s) ... i8254 clock: 1193242 Hz
CLK_USE_I8254_CALIBRATION not specified - using default frequency

Re: [ATA] and re(4) stability issues

2008-12-09 Thread Pyun YongHyeon
On Tue, Dec 09, 2008 at 07:52:37PM +0100, Victor Balada Diaz wrote:
  Hello,
  
  I got various machines[1] at hetzner.de and I've been having problems
  with interrupts on FreeBSD 7.0 and now FreeBSD 7.1 -BETA2 in amd64. I've
  been trying to narrow the problem so someone more knowledgeable than me
  is able to fix it. This mail is another attempt to ask a question
  with regard to the ATA code to see if this time i got something.
  
  For the ones that don't actually know what happened:
  
  With FreeBSD 7.0 -RELEASE for amd64 and default kernel
  the system shared re0 interrupt with OHCI and this caused
  re(4) to corrupt packets and create interrupt storms. Tried

re(4) in 7.0-RELEASE had a bus_dma(9) bug which could be easily
triggered on systems with > 4GB memory. But I don't know whether
this is related to the interrupt storms.

  updating to 7.1 -BETA2 and still had some problems with it.
  
  I've opened the PR kern/128287[2] and Remko quickly answered
  with a workaround: that workaround was removing USB support from
  my kernel. I did it and re(4) wasn't sharing interrupts any longer,
  and the interrupt storms were gone. Now sometime later the interface
  goes up and down from time to time, but less often. Also sometimes
  the machine loses the network interface but continues to work.
  

It seems that your controller supports MSI so you can set a tunable
hw.re.msi_disable to 0 to enable MSI. With MSI you can remove
interrupt sharing(e.g. add hw.re.msi_disable=0 to
/boot/loader.conf file.) However there were several issues on re(4)
w.r.t MSI so it was off by default.
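
Before rebooting it may be worth checking that loader.conf really requests MSI
for re(4) and does not turn MSI off system-wide (the hw.pci.enable_msi tunable
that comes up elsewhere in this thread). The sketch below is only an
illustration: tunable and check_re_msi are made-up helper names, the parser
ignores trailing comments on a line, and the demo runs on a temporary file
rather than the real /boot/loader.conf.

```sh
#!/bin/sh
# Print the last value assigned to a tunable in a loader.conf-style
# file (KEY=VALUE or KEY="VALUE"); prints an empty line if unset.
tunable() {
    awk -F= -v key="$1" '
        { k = $1; gsub(/[ \t]/, "", k) }
        k == key { v = $2; gsub(/[ \t"]/, "", v); val = v }
        END { print val }' "$2"
}

# Report whether the file asks for MSI on re(4).
check_re_msi() {
    conf=$1
    re_msi=$(tunable hw.re.msi_disable "$conf")
    pci_msi=$(tunable hw.pci.enable_msi "$conf")
    if [ "$pci_msi" = "0" ]; then
        echo "system-wide MSI is disabled (hw.pci.enable_msi=0)"
        return 1
    fi
    if [ "$re_msi" = "0" ]; then
        echo "re(4) MSI requested"
    else
        echo "re(4) MSI left at its default (off)"
        return 1
    fi
}

# Demo on a scratch file, so nothing on the system is touched.
demo_conf=$(mktemp)
printf 'hw.re.msi_disable="0"\n' > "$demo_conf"
check_re_msi "$demo_conf"    # prints: re(4) MSI requested
rm -f "$demo_conf"
```

On the real machine the call would be "check_re_msi /boot/loader.conf".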

  I know it continues to work because some days later i can see that
  it tried to deliver the status reports but was unable to resolve the
  aliases hostnames. I can't ping the machine and i know the network
  is OK. If i reboot the machine everything is working again.
  

Recently I've made small changes to re(4) which may help to detect
link state change events. Would you try re(4) in HEAD?

-- 
Regards,
Pyun YongHyeon


Snapshot stability issues on 6.3

2008-08-26 Thread Michael R. Wayne
One of our servers, running a bunch of jails, has issues when doing
nightly dumps only if snapshots are enabled.  This box was running
5.X and has been upgraded over time to 6.3.  When running 5.X, we
attempted to use snapshots on dump (-L) which resulted in almost
nightly system hangs during the dump.

We ran 6.X for months with no stability issues, backing up nightly
w/o snapshots.  Took the system down to single user mode, did
foreground fsck, enabled -L on dumps and the machine kinda hangs
twice a week (main host seems OK but jails stop responding and can
not be properly stopped), requiring a reset.  Removing the snapshots
restores system stability.  Foreground fsck finds nothing unusual,
just what I would expect when doing a reset on a live filesystem.

I suspect that there is some corrupt filesystem residue from 5.X
since we have no similar issues on 6.x clean installs.  Is there
something better than just fsck to attempt to resolve these issues?

/\/\ \/\/


6.3 stability and freebsd-update (was: Re: challenge: end of life for 6.2 is premature with buggy 6.3)

2008-06-06 Thread Royce Williams
 On Jun 4, 2008, at 4:43 PM, Clifton Royston wrote:

  Speaking just for myself, I'd love to get a general response from
 people who have run servers on both as to whether 6.3 is on average
 more stable than 6.2.  I really haven't gotten any clear impression as

6.3 has been stable for me.  I've been running since April on DL380
G2, Dell 2450, Supermicro 5015M-MF and 5015T-PR, and some older Intel
815E boards.  My bge(4) NICs are detected as Broadcom BCM5703 A2 and
BCM5704 A2, with no issues.  Running gmirror with no issues.

Someone mentioned freebsd-update earlier in the thread.  I'd like to
take a moment to plug it, since making it easy to move to 6.3 seems
topical.  I got a little long-winded, so here's an executive summary:

freebsd-update is good; business case for more hardware; updating in
'hybrid' mode with custom kernels and stock userland; using kernel
config 'includes' to save additional effort.


I prefer freebsd-update over the buildworld and then
installworld-over-NFS routine, centralized rsyncs, or anything else.
I used freebsd-update to uplift the systems above, and also just
bumped sixteen more from 6.2.  Worked like a charm.  Its 'rollback'
option is also very handy, for obvious reasons.

Based on how much time I save with freebsd-update, I can make a
business case for buying another box for a farm, rather than rolling
my own kernels and eking out xx% of additional performance.  Once ULE
gets into 7.x-RELEASE, I probably won't even have to do that.

For systems that require a custom kernel, we still patch everything
else with freebsd-update.  When freebsd-update detects the non-stock
kernel, it warns you to install a kernel from the target release.  If
that scares you, you can swap in a stock kernel from the current
release (saved off, or from the release media or FTP) and then
upgrade.  When finished, save off the new stock kernel for future
upgrades, and then rebuild your custom kernel.  (Anybody else doing
anything like this, or something better?)

I also recommend starting your kernel config with 'include SMP' (or
GENERIC or PAE or whatever).  If you use nocpu, nooptions,
nomakeoptions, nodevice etc. to turn off what you don't need, your
config is reduced to a 'diff' of sorts against the stock config.  Our
kernel configs are now ~17 lines, can be grokked at a glance, and
should change little from release to release.  Here's a stub of an
example that uses most of the knobs:

include     SMP                     # Inherit SMP (which inherits GENERIC).

nocpu       I486_CPU                # Disable old CPU support; see tuning(7).
nocpu       I586_CPU
ident       SMP-GENERIC_CUSTOM_FOO  # Inherit ident, custom name.

nomakeoption DEBUG                  # Do not build with gdb(1) debug symbols.

nooptions   SCHED_4BSD              # Do not use the 4BSD scheduler;
options     SCHED_ULE               #   use the ULE scheduler instead.

# ALTQ support
options     ALTQ
options     ALTQ_CBQ                # Class Based Queuing (CBQ)
options     ALTQ_RED                # Random Early Detection (RED)
options     ALTQ_RIO                # RED In/Out
options     ALTQ_HFSC               # Hierarchical Packet Scheduler (HFSC)
options     ALTQ_PRIQ               # Priority Queuing (PRIQ)
options     ALTQ_NOPCC              # Required for SMP build

# Devices for pf
device      pf                      # PF
device      pflog                   # pflog
device      pfsync                  # pfsync


Use 'nodevice' to turn off devices worth leaving out, but only as many
as are worth the effort.

If you haven't already considered freebsd-update, either for the whole
system or just userland, I highly recommend it.  These days, the gain
has to be pretty significant for me to want to go back to making
world.  For our PXE installs using custom install.cfg, we can go from
bare metal to a fully patched vanilla system in four minutes on modern
hardware!  The novelty of that still hasn't worn off. :-)

It's more efficient (and probably safer?) to use freebsd-update
against a binary install rather than against local compilation.  And
if you're bumping major versions (6.x -> 7.x), you still have to
rebuild your ports.  But try it in your lab or for your next
deployment.  You can easily convert a freebsd-updated system to a
makeworld system, if necessary.

And thanks again, Colin!

Royce

-- 
Royce D. Williams   - http://royce.ws/
  Inspiration exists, but it has to find us working. - Pablo Picasso


Re: 6.3 stability and freebsd-update (was: Re: challenge: end of life for 6.2 is premature with buggy 6.3)

2008-06-06 Thread Brooks Davis
On Fri, Jun 06, 2008 at 11:34:01AM -0800, Royce Williams wrote:
  On Jun 4, 2008, at 4:43 PM, Clifton Royston wrote:
 
   Speaking just for myself, I'd love to get a general response from
  people who have run servers on both as to whether 6.3 is on average
  more stable than 6.2.  I really haven't gotten any clear impression as
 
 6.3 has been stable for me.  I've been running since April on DL380
 G2, Dell 2450, Supermicro 5015M-MF and 5015T-PR, and some older Intel
 815E boards.  My bge(4) NICs are detected as Broadcom BCM5703 A2 and
 BCM5704 A2, with no issues.  Running gmirror with no issues.
 
 Someone mentioned freebsd-update earlier in the thread.  I'd like to
 take a moment to plug it, since making it easy to move to 6.3 seems
 topical.  I got a little long-winded, so here's an executive summary:
 
 freebsd-update is good; business case for more hardware; updating in
 'hybrid' mode with custom kernels and stock userland; using kernel
 config 'includes' to save additional effort.
 
 
 I prefer freebsd-update over the buildworld and then
 installworld-over-NFS routine, centralized rsyncs, or anything else.
 I used freebsd-update to uplift the systems above, and also just
 bumped sixteen more from 6.2.  Worked like a charm.  Its 'rollback'
 option is also very handy, for obvious reasons.
 
 Based on how much time I save with freebsd-update, I can make a
 business case for buying another box for a farm, rather than rolling
 my own kernels and eking out xx% of additional performance.  Once ULE
 gets into 7.x-RELEASE, I probably won't even have to do that.
 
 For systems that require a custom kernel, we still patch everything
 else with freebsd-update.  When freebsd-update detects the non-stock
 kernel, it warns you to install a kernel from the target release.  If
 that scares you, you can swap in a stock kernel from the current
 release (saved off, or from the release media or FTP) and then
 upgrade.  When finished, save off the new stock kernel for future
 upgrades, and then rebuild your custom kernel.  (Anybody else doing
 anything like this, or something better?)

Alternatively, you can edit freebsd-update.conf and tell it not to update your
kernel on those machines.

You can also exclude particular files.  We use this to keep from
updating libc directly on some machines where we're using modified RPC
timings to improve NIS performance in the face of occasional packet
loss.
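
As a rough illustration of the first suggestion: the stock
freebsd-update.conf lists what to update on a "Components" line (by default
"Components src world kernel", as far as I recall), so dropping "kernel" from
that line is one way to leave a custom kernel alone. The sketch below only
filters text and touches no files; strip_kernel is a made-up helper name. For
excluding individual files such as libc, the relevant directive is, I believe,
IgnorePaths; freebsd-update.conf(5) is the place to confirm both.

```sh
#!/bin/sh
# Remove the "kernel" component from the Components line of a
# freebsd-update.conf read from the given file(s) or standard input.
# Other lines pass through unchanged.
strip_kernel() {
    sed -E '/^Components[[:space:]]/ s/[[:space:]]+kernel([[:space:]]|$)/\1/' "$@"
}

printf 'Components src world kernel\n' | strip_kernel
# prints: Components src world
```

To apply it for real, write the filtered output to a new file, review it, and
then move it over /etc/freebsd-update.conf.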

-- Brooks

 GENERIC or PAE or whatever).  If you use nocpu, nooptions,
 nomakeoptions, nodevice etc. to turn off what you don't need, your
 config is reduced to a 'diff' of sorts against the stock config.  Our
 kernel configs are now ~17 lines, can be grokked at a glance, and
 should change little from release to release.  Here's a stub of an
 example that uses most of the knobs:
 
 include     SMP                     # Inherit SMP (which inherits GENERIC).
 
 nocpu       I486_CPU                # Disable old CPU support; see tuning(7).
 nocpu       I586_CPU
 ident       SMP-GENERIC_CUSTOM_FOO  # Inherit ident, custom name.
 
 nomakeoption DEBUG                  # Do not build with gdb(1) debug symbols.
 
 nooptions   SCHED_4BSD              # Do not use the 4BSD scheduler;
 options     SCHED_ULE               #   use the ULE scheduler instead.
 
 # ALTQ support
 options     ALTQ
 options     ALTQ_CBQ                # Class Based Queuing (CBQ)
 options     ALTQ_RED                # Random Early Detection (RED)
 options     ALTQ_RIO                # RED In/Out
 options     ALTQ_HFSC               # Hierarchical Packet Scheduler (HFSC)
 options     ALTQ_PRIQ               # Priority Queuing (PRIQ)
 options     ALTQ_NOPCC              # Required for SMP build
 
 # Devices for pf
 device      pf                      # PF
 device      pflog                   # pflog
 device      pfsync                  # pfsync
 
 
 Use 'nodevice' to turn off devices worth leaving out, but only as many
 as are worth the effort.
 
 If you haven't already considered freebsd-update, either for the whole
 system or just userland, I highly recommend it.  These days, the gain
 has to be pretty significant for me to want to go back to making
 world.  For our PXE installs using custom install.cfg, we can go from
 bare metal to a fully patched vanilla system in four minutes on modern
 hardware!  The novelty of that still hasn't worn off. :-)
 
 It's more efficient (and probably safer?) to use freebsd-update
 against a binary install rather than against local compilation.  And
 if you're bumping major versions (6.x -> 7.x), you still have to
 rebuild your ports.  But try it in your lab or for your next
 deployment.  You can easily convert a freebsd-updated system to a
 makeworld system, if necessary.
 
 And thanks again, Colin!
 
 Royce
 
 -- 
 Royce D. Williams   - http://royce.ws/
   Inspiration exists, but it has to find us working. - Pablo Picasso

Re: FreeBSD 6.3-stable and if_re - stability problems?

2008-01-28 Thread Pyun YongHyeon
On Sun, Jan 27, 2008 at 01:48:36PM +0100, Torfinn Ingolfsen wrote:
  Hello!
  
  Is anybody having stability problems with if_re under FreeBSD
  6.3-stable?
  I know about PR kern/118719[1] but it doesn't look like the problem I'm
  having - at least my machine doesn't panic.
  My machine[2] runs FreeBSD 6.3-stable / amd64:
  [EMAIL PROTECTED] uname -a
  FreeBSD kg-vm.kg4.no 6.3-STABLE FreeBSD 6.3-STABLE #1: Sun Jan 27 02:10:15 
  CET 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  amd64
  
  The machine has an if_re interface:
  [EMAIL PROTECTED] pciconf -lv | grep -B 4 network
  subclass   = VGA
  [EMAIL PROTECTED]:0:0:   class=0x02 card=0x81aa1043 chip=0x816810ec 
  rev=0x01 hdr=0x00
  vendor = 'Realtek Semiconductor'
  device = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
  class  = network
  
  It looks like the interface works, because it gets the ip
  address and other dhcp parameters right. But when I try to pass traffic
  over it, traffic often just stops. Examples: ping by ip address,
  csup'ing the ports. also ssh connections to the machine randomly hangs, or 
  closes.
  

I think re(4) in CURRENT has fixed these issues.
Would you try re(4) in CURRENT?

  Ouch! The machine just rebooted. Perhaps this is kern/118719 after all.
  Anything I can do to diagnose this problem further?
  FWIW, I have FreeBSD 7.0 (RELENG_7) installed on another slice and that 
  works nicely.
  
  References:
  1) PR kern/118719  http://www.freebsd.org/cgi/query-pr.cgi?pr=118719
  2) my machine http://tingox.googlepages.com/asus_m2a-vm_hdmi_freebsd
  -- 
  Regards, 
  Torfinn Ingolfsen
  

-- 
Regards,
Pyun YongHyeon


Re: FreeBSD 6.3-stable and if_re - stability problems?

2008-01-28 Thread Torfinn Ingolfsen
On Mon, 28 Jan 2008 17:22:16 +0900
Pyun YongHyeon [EMAIL PROTECTED] wrote:

 I think re(4) in CURRENT has fixed these issues.
 Would you try re(4) in CURRENT?

Perhaps I was being unclear; under FreeBSD 7.x (RELENG_7) re(4) works
fine.
It is only 6.3 that has problems with re(4).
-- 
Regards,
Torfinn Ingolfsen



Re: FreeBSD 6.3-stable and if_re - stability problems?

2008-01-28 Thread Pyun YongHyeon
On Mon, Jan 28, 2008 at 09:43:12PM +0100, Torfinn Ingolfsen wrote:
  On Mon, 28 Jan 2008 17:22:16 +0900
  Pyun YongHyeon [EMAIL PROTECTED] wrote:
  
   I think re(4) in CURRENT has fixed these issues.
   Would you try re(4) in CURRENT?
  
  Perhaps I was being unclear; under FreeBSD 7.x (RELENG_7) re(4) works
  fine.
  It is only 6.3 that has problems with re(4).

Ah, ok. I'll make sure to MFC the changes to RELENG_6 after more
testing.

  -- 
  Regards,
  Torfinn Ingolfsen
  

-- 
Regards,
Pyun YongHyeon


FreeBSD 6.3-stable and if_re - stability problems?

2008-01-27 Thread Torfinn Ingolfsen
Hello!

Is anybody having stability problems with if_re under FreeBSD
6.3-stable?
I know about PR kern/118719[1] but it doesn't look like the problem I'm
having - at least my machine doesn't panic.
My machine[2] runs FreeBSD 6.3-stable / amd64:
[EMAIL PROTECTED] uname -a
FreeBSD kg-vm.kg4.no 6.3-STABLE FreeBSD 6.3-STABLE #1: Sun Jan 27 02:10:15 CET 
2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  amd64

The machine has an if_re interface:
[EMAIL PROTECTED] pciconf -lv | grep -B 4 network
subclass   = VGA
[EMAIL PROTECTED]:0:0:   class=0x02 card=0x81aa1043 chip=0x816810ec 
rev=0x01 hdr=0x00
vendor = 'Realtek Semiconductor'
device = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
class  = network
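
For readers following along: the chip= value in the pciconf output above packs two 16-bit PCI identifiers, the device ID in the upper half and the vendor ID in the lower half. A minimal sketch of splitting it in POSIX sh is shown here; the function name split_chip is made up for this illustration and is not part of pciconf(8).

```shell
#!/bin/sh
# split_chip: hypothetical helper, not part of pciconf(8).
# Splits a pciconf "chip=0xDDDDVVVV" value into vendor and device IDs.
split_chip() {
    id=${1#0x}                                 # drop the 0x prefix
    device=$(printf '%s' "$id" | cut -c1-4)    # upper 16 bits: device ID
    vendor=$(printf '%s' "$id" | cut -c5-8)    # lower 16 bits: vendor ID
    printf 'vendor=0x%s device=0x%s\n' "$vendor" "$device"
}

# Usage: split_chip 0x816810ec
```

For the value quoted above this yields vendor 0x10ec (Realtek) and device 0x8168, which agrees with the RTL8168/8111 description that pciconf prints.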

It looks like the interface works, because it gets the IP
address and other DHCP parameters right. But when I try to pass traffic
over it, traffic often just stops. Examples: ping by IP address,
csup'ing the ports. Also, ssh connections to the machine randomly hang or
close.

Ouch! The machine just rebooted. Perhaps this is kern/118719 after all.
Anything I can do to diagnose this problem further?
FWIW, I have FreeBSD 7.0 (RELENG_7) installed on another slice and that works 
nicely.

References:
1) PR kern/118719  http://www.freebsd.org/cgi/query-pr.cgi?pr=118719
2) my machine http://tingox.googlepages.com/asus_m2a-vm_hdmi_freebsd
-- 
Regards, 
Torfinn Ingolfsen



Re: FreeBSD 6.3-stable and if_re - stability problems?

2008-01-27 Thread Torfinn Ingolfsen
On Sun, 27 Jan 2008 13:48:36 +0100
Torfinn Ingolfsen [EMAIL PROTECTED] wrote:

 Ouch! The machine just rebooted. Perhaps this is kern/118719 after
 all. Anything I can do to diagnose this problem further?

Forget about the reboot - it was caused by my attempt at a workaround
(using an if_ural interface connected to a USB port; this will reliably
panic the machine after some amount of data has flowed through the
interface).
-- 
Regards,
Torfinn Ingolfsen



Re: EM stability

2006-11-13 Thread Gleb Smirnoff
On Sun, Nov 12, 2006 at 02:26:36PM -0600, Barry Boes wrote:
B> After the last hang I added giant locks back in and the machine has
B> been up since.
B> 
B> I don't have a serial console, just a graphic console.  When the
B> machine hangs it stops replying to ethernet packets at all protocol
B> levels and doesn't respond to keyboard input in any way, virtual
B> console or otherwise.  If I run a script of the form
B>    while (1)
B>      sleep 1
B>      date >> datelog
B>    end
B> 
B> the file stops updating when the machine hangs.
B> 
B> I will define the debugger in the kernel (options DDB, right?), attach
B> a serial console, and do what I can to get more information on the
B> problem.

Yes, this looks like something is running in an endless loop. Once
you have compiled a kernel with the debugger, you should enter it several
times and look at the backtraces. Usually, they will be inside this loop.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE


Re: EM stability

2006-11-12 Thread Barry Boes

After the last hang I added giant locks back in and the machine has
been up since.

I don't have a serial console, just a graphic console.  When the
machine hangs it stops replying to ethernet packets at all protocol
levels and doesn't respond to keyboard input in any way, virtual
console or otherwise.  If I run a script of the form
   while (1)
     sleep 1
     date >> datelog
   end

the file stops updating when the machine hangs.
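
The loop above is csh syntax. As a rough illustration of the same idea in POSIX sh, the sketch below appends one timestamp per interval so that the last entry shows approximately when the machine stopped making progress. The function name heartbeat, its arguments and the log path in the usage line are assumptions made for this example.

```shell
#!/bin/sh
# heartbeat: hypothetical helper name.
# Appends one timestamp per interval to a log file.
#   $1 = log file
#   $2 = seconds between entries (default 1)
#   $3 = number of entries to write (default 0 = run until killed)
heartbeat() {
    log=$1
    interval=${2:-1}
    count=${3:-0}
    n=0
    while [ "$count" -eq 0 ] || [ "$n" -lt "$count" ]; do
        date '+%Y-%m-%d %H:%M:%S' >> "$log"
        n=$((n + 1))
        sleep "$interval"
    done
}

# Usage: heartbeat /var/tmp/datelog 1 &
```

On a hard hang the final few entries may not have reached the disk yet, so the last timestamp is best read as a lower bound on the time of the hang.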

I will define the debugger in the kernel (options DDB, right?), attach
a serial console, and do what I can to get more information on the
problem.

-Barry


Jack Vogel writes:
  On 11/10/06, Barry Boes [EMAIL PROTECTED] wrote:
  
   Luck ran out.  A hard, must-press-the-reset-button hang.  No console
   messages.   The system was idle at the time.
  Is there anything you'd like me to do to attempt to narrow down the
   problem or get debugging output?  I do not know if the freeze was
   related to em or something else.
  
  Is this a machine running some graphic head? If not can you see anything
  on the console? Are you sure the machine is dead, like can you get in
  over the network... ? One thing I often do when you are dealing with
  unpredictable hangs is run 'vmstat 3' on one of the virtual terminals.
  
  You might also compile the kernel debugger into your kernel; it's best to
  have a serial console for this, since I've seen the hardware console lock
  up while the serial console still works.
  
  The only way we will track this down is thru repetitive reproduction I'm
  afraid.
  
  Jack


Re: EM stability

2006-11-10 Thread Gleb Smirnoff
  Hello Barry,

On Fri, Nov 10, 2006 at 08:56:30AM -0600, Barry Boes wrote:
B>    I see you listed on the EM stability issues list.  I have a Tyan
B> H1000S with dual em ports on 6.1, and it won't stay up 5 minutes
B> without EM watchdog resets unless I use giant locks.
B>    Is there any way you'd like me to help you with testing the updated
B> drivers?

Yes, please upgrade to the latest RELENG_6 via cvsup, build a new
kernel and report whether the problem is fixed or not.

You see, I have added a lot of people and two mailing lists to Cc.
Please do not remove them when replying. Thanks!

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE


Re: EM stability

2006-11-10 Thread Barry Boes

So far so good.  I updated to the latest, including jfv's revision
1.65.2.21 from this AM.

With the 6.1 ISO distribution, I would get watchdogs within seconds of
starting a file transfer (except with giant locking, which worked fine).

With RELENG_6 I've transferred hundreds of GB via FTP and NFS over both
ethernet ports with no problems yet.

Thanks for all the hard work!
Barry



Gleb Smirnoff writes:
Hello Barry,
  
  On Fri, Nov 10, 2006 at 08:56:30AM -0600, Barry Boes wrote:
  B>    I see you listed on the EM stability issues list.  I have a Tyan
  B> H1000S with dual em ports on 6.1, and it won't stay up 5 minutes
  B> without EM watchdog resets unless I use giant locks.
  B>    Is there any way you'd like me to help you with testing the updated
  B> drivers?
  
  Yes, please upgrade to the latest RELENG_6 via cvsup, build a new
  kernel and report whether the problem is fixed or not.
  
  You see, I have added a lot of people and two mailing lists to Cc.
  Please do not remove them when replying. Thanks!
  
  -- 
  Totus tuus, Glebius.
  GLEBIUS-RIPN GLEB-RIPE


Re: EM stability

2006-11-10 Thread Barry Boes

Luck ran out.  A hard, must-press-the-reset-button hang.  No console
messages.   The system was idle at the time.
   Is there anything you'd like me to do to attempt to narrow down the
problem or get debugging output?  I do not know if the freeze was
related to em or something else.

-Barry


Barry Boes writes:
  
  So far so good.  I updated to the latest, including jfv's revision
  1.65.2.21 from this AM.
  
  With the 6.1 ISO distribution, I would get watchdogs within seconds of
  starting a file transfer (except with giant locking, which worked fine).
  
  With RELENG_6 I've transferred hundreds of GB via FTP and NFS over both
  ethernet ports with no problems yet.
  
  Thanks for all the hard work!
  Barry
  
  
  
  Gleb Smirnoff writes:
  Hello Barry,

On Fri, Nov 10, 2006 at 08:56:30AM -0600, Barry Boes wrote:
B>    I see you listed on the EM stability issues list.  I have a Tyan
B> H1000S with dual em ports on 6.1, and it won't stay up 5 minutes
B> without EM watchdog resets unless I use giant locks.
B>    Is there any way you'd like me to help you with testing the updated
B> drivers?

Yes, please upgrade to the latest RELENG_6 via cvsup, build a new
kernel and report whether the problem is fixed or not.

You see, I have added a lot of people and two mailing lists to Cc.
Please do not remove them when replying. Thanks!

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE


Re: EM stability

2006-11-10 Thread Gleb Smirnoff
On Fri, Nov 10, 2006 at 04:28:30PM -0600, Barry Boes wrote:
B 
B> Luck ran out.  A hard, must-press-the-reset-button hang.  No console
B> messages.   The system was idle at the time.
B>    Is there anything you'd like me to do to attempt to narrow down the
B> problem or get debugging output?  I do not know if the freeze was
B> related to em or something else.

In cases like this you need to prepare a kernel with the debugger compiled
in and try to break into the debugger when the hang occurs. You can
try keyboard debugger sequence, and if it fails try serial break.
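
As a rough sketch of what "a kernel with the debugger compiled in" can look like on a 6.x system, the helper below appends the usual debugger options to a copy of a kernel configuration file. The function name add_debug_options, the config name DEBUG and the paths in the usage comment are assumptions for this example; KDB, DDB and BREAK_TO_DEBUGGER are the standard option names.

```shell
#!/bin/sh
# add_debug_options: hypothetical helper name.
# Appends debugger-related options to a kernel configuration file.
#   $1 = path to a copy of the kernel config (for example a copy of GENERIC)
add_debug_options() {
    conf=$1
    cat >> "$conf" <<'EOF'
options KDB                 # kernel debugger framework
options DDB                 # interactive in-kernel debugger
options BREAK_TO_DEBUGGER   # a BREAK on the serial console enters the debugger
EOF
}

# Usage (paths and the DEBUG name are assumptions, adjust to your tree):
#   cp /usr/src/sys/i386/conf/GENERIC /usr/src/sys/i386/conf/DEBUG
#   add_debug_options /usr/src/sys/i386/conf/DEBUG
#   cd /usr/src && make buildkernel KERNCONF=DEBUG && make installkernel KERNCONF=DEBUG
```

With such a kernel, Ctrl+Alt+Esc on the system console is the default keyboard sequence. Once at the db> prompt, trace and ps are the commands most likely to show where the machine is looping.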

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE


Re: EM stability

2006-11-10 Thread Jack Vogel

On 11/10/06, Barry Boes [EMAIL PROTECTED] wrote:


Luck ran out.  A hard, must-press-the-reset-button hang.  No console
messages.   The system was idle at the time.
   Is there anything you'd like me to do to attempt to narrow down the
problem or get debugging output?  I do not know if the freeze was
related to em or something else.


Is this a machine running some graphic head? If not can you see anything
on the console? Are you sure the machine is dead, like can you get in
over the network... ? One thing I often do when you are dealing with
unpredictable hangs is run 'vmstat 3' on one of the virtual terminals.
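
One possible variation on the 'vmstat 3' suggestion, sketched here under the assumption that the output should survive the reset: pipe it through a small filter that timestamps each line and append the result to a file. The name stamp_lines and the log path are made up for this example.

```shell
#!/bin/sh
# stamp_lines: hypothetical helper name.
# Prefixes every line read from standard input with the current time.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
    done
}

# Usage: vmstat 3 | stamp_lines >> /var/tmp/vmstat.log
```

The last few lines of the log then give a picture of memory, interrupt and CPU activity shortly before the hang, alongside whatever was visible on the virtual terminal.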

You might also compile the kernel debugger into your kernel; it's best to
have a serial console for this, since I've seen the hardware console lock
up while the serial console still works.

The only way we will track this down is thru repetitive reproduction I'm
afraid.

Jack


nfs/geli stability problems and file corruption

2006-08-17 Thread Chris

Hi, on 3 different servers we had the same problem, described below.

2 servers ran 6.1-STABLE and one ran the 6.1-RELEASE security branch.

2 of the servers had been running stably for months beforehand without
NFS and geli; the 3rd was brand new.

We enabled geli encryption on loopback partitions and real partitions
to encrypt our data, then mounted some NFS mounts from Debian
servers.

Things would run well for anything from a few days to a few weeks.
Then the server stops responding.

We reboot the server and it fails to boot with an error: /etc/fstab
unable to mount filesystem.

The local tech was unable to proceed from there, so the boxes were
formatted and the data lost.

On one of the servers, before it died, I did notice NFS stale errors.

What has me curious is (a) what's causing the crashes or network
hangs, since the boxes may well have still been responding locally, and
(b) what was causing the filesystems not to mount upon reboot.  The
hardware is not the cause: I am talking about 3 different servers, 2 of
which were running fine for months and have run fine since Debian was
put on.

The first 2 servers aren't mine and the owners had had enough of the
problems; the 3rd server being mine, I want to find a solution to run it
on FreeBSD.  I have to pay for reformats, so ideally I don't want to do
trial and error and pay for a reinstall every time.  I did
some googling and it appears NFS on FreeBSD has its problems; I have
seen PRs sent for NFS that went unanswered and some documents about
FreeBSD NFS having problems with other OSes' NFS.

The crashes aren't so bad; the killer is the failure to mount on the
following reboot.  Could geli be causing this, since it is relatively
new, whilst gbde is more established?  Can gbde be used on loopback
filesystems?

Thanks

Chris


Re: Stability of ICH7 sata on FreeBSD 6.1 ?

2006-08-11 Thread Mike Jakubik

Dominic Marks wrote:

Jerome Sobecki wrote:

Hi all,

We have here some Supermicro Superserver 5015P-TR
(http://www.supermicro.com/products/system/1U/5015/SYS-5015P-TR.cfm)

Those servers, with an ICH7 controller, are currently working with FreeBSD
6.1 and everything seems OK, except that it's the third time, on two
different machines, that the system has crashed because it lost its hard drive.
  


I have two Supermicro PDSMi MB servers in production and I am also
experiencing mysterious disk losses. The system continues to function
fine, as I am using gmirror, but something strange is going on.
Sometimes the disks come back; sometimes I need to reboot the system to
get them back. I am using Seagate ST3160812AS 3.AAE drives. The drives
report no SMART errors, and the cables are secure. The drives are
attached to a hot-swap backplane.





Re: Stability of ICH7 sata on FreeBSD 6.1 ?

2006-08-11 Thread Miroslav Lachman

Mike Jakubik wrote:


Dominic Marks wrote:


Jerome Sobecki wrote:


Hi all,

We have here some Supermicro Superserver 5015P-TR
(http://www.supermicro.com/products/system/1U/5015/SYS-5015P-TR.cfm)

Those servers, with an ICH7 controller, are currently working with FreeBSD
6.1 and everything seems OK, except that it's the third time, on two
different machines, that the system has crashed because it lost its hard drive.
  



I have two Supermicro PDSMi MB servers in production and I am also
experiencing mysterious disk losses. The system continues to function
fine, as I am using gmirror, but something strange is going on.
Sometimes the disks come back; sometimes I need to reboot the system to
get them back. I am using Seagate ST3160812AS 3.AAE drives. The drives
report no SMART errors, and the cables are secure. The drives are
attached to a hot-swap backplane.


I have the same problem on an ASUS RS120 with Seagate ST3250820AS/3.AAC drives
(disk losses, system reboots, slow read/write speed), but I think this is a
drive problem - all drives have a high Reallocated_Sector_Ct value in SMART
(above 130 reallocated sectors after a few weeks; some drives have more
than 100 after a few days).
Please let me know if you also have a nonzero Reallocated_Sector_Ct in
smartctl -A output.
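
For anyone who wants to check this across several machines, a small filter along the following lines can pull the raw value out of the attribute table. It is only a sketch: the function name realloc_count and the device name in the usage line are assumptions, and it relies on the raw value being the tenth column of the smartctl -A table.

```shell
#!/bin/sh
# realloc_count: hypothetical helper name.
# Reads `smartctl -A` output on standard input and prints the raw value
# (tenth column) of the Reallocated_Sector_Ct attribute, if present.
realloc_count() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}

# Usage (device name is an assumption): smartctl -A /dev/ad4 | realloc_count
```

A result of 0 means no sectors have been remapped so far; an empty result means the drive does not report the attribute under that name.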

I will test those servers with brand new Samsung drives, hope that it helps.

Miroslav Lachman


Re: Stability of ICH7 sata on FreeBSD 6.1 ?

2006-08-11 Thread Mike Jakubik

Miroslav Lachman wrote:
I have the same problem on an ASUS RS120 with Seagate ST3250820AS/3.AAC
drives (disk losses, system reboots, slow read/write speed), but I
think this is a drive problem - all drives have a high
Reallocated_Sector_Ct value in SMART (above 130 reallocated sectors
after a few weeks; some drives have more than 100 after a few days).
Please let me know if you also have a nonzero Reallocated_Sector_Ct in
smartctl -A output.
I will test those servers with brand new Samsung drives; I hope that it
helps.


I don't think you have the same problem, in your case it sounds like a 
bad hard drive. On my servers, all the hard drives are less than two 
months old, and are completely error free according to SMART.





Stability of ICH7 sata on FreeBSD 6.1 ?

2006-08-07 Thread Jerome Sobecki
Hi all,

We have here some Supermicro Superserver 5015P-TR
(http://www.supermicro.com/products/system/1U/5015/SYS-5015P-TR.cfm)

Those servers, with an ICH7 controller, are currently working with FreeBSD
6.1 and everything seems OK, except that it's the third time, on two
different machines, that the system has crashed because it lost its hard drive.


The logs we get on console during the last crash (ad4s1g is /var, so we
don't have any other logs):
g_vfs_done():ad4s1g[WRITE(offset=35657547776, length=16384)]error = 6
[...]
g_vfs_done():ad4s1g[WRITE(offset=35662495744, length=16384)]error = 6
g_vfs_done():ad4s1g[READ(offset=23900815360, length=16384)]error = 6
handle_workitem_freeblocks: block count

The logs we have in /var/log/messages from another crash:
Jul 26 19:34:07 munster2 kernel: ad6: FAILURE - device detached
Jul 26 19:34:07 munster2 kernel: subdisk6: detached
Jul 26 19:34:07 munster2 kernel: ad6: detached
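
Error 6 in the console messages is ENXIO ("Device not configured"), which is consistent with the disk having disappeared from the bus. To see how often and on which devices this happens, a filter like the sketch below can list the affected disks from the syslog file; the function name detached_disks is made up for this example, and it only recognises lines in the "FAILURE - device detached" form shown above.

```shell
#!/bin/sh
# detached_disks: hypothetical helper name.
# Reads syslog lines on standard input and prints the name of every
# device that logged "FAILURE - device detached", one per line.
detached_disks() {
    sed -n 's/.* kernel: \([a-z]*[0-9]*\): FAILURE - device detached.*/\1/p'
}

# Usage: detached_disks < /var/log/messages | sort | uniq -c
```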

When the machine crashes, the LED of the lost disk stays lit, and a soft
reboot doesn't bring the disk back: a full power-off is
necessary.

Before the crash, the servers had more than 1 month of uptime in
production, and others are still OK...

information about the machine :

vieux-lille2% uname -v 
FreeBSD 6.1-STABLE #3: Thu Jun  8 12:47:45 CEST 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC 

It's true that it is not up to date, but I didn't see a cvs commit about that
problem (maybe I simply missed it).

Has anyone had that problem? Do you think updating the system from 6.1
sources will be enough?

Here is the dmesg, in case it helps:
vieux-lille2% dmesg
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights
reserved.
FreeBSD 6.1-STABLE #3: Thu Jun  8 12:47:45 CEST 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.40GHz (3400.15-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf43  Stepping = 3
  
Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
  Features2=0x649dSSE3,RSVD2,MON,DS_CPL,EST,CNTX-ID,CX16,b14
  AMD Features=0x2010NX,LM
  Logical CPUs per core: 2
real memory  = 1072562176 (1022 MB)
avail memory = 1040637952 (992 MB)
ACPI APIC Table: PTLTD  APIC  
ioapic0 Version 2.0 irqs 0-23 on motherboard
ioapic1 Version 2.0 irqs 24-47 on motherboard
ioapic2 Version 2.0 irqs 48-71 on motherboard
kbd1 at kbdmux0
acpi0: PTLTD   RSDT on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x1008-0x100b on acpi0
cpu0: ACPI CPU on acpi0
acpi_throttle0: ACPI CPU Throttling on cpu0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
pci1: ACPI PCI bus on pcib1
pcib2: ACPI PCI-PCI bridge at device 0.0 on pci1
pci2: ACPI PCI bus on pcib2
pci1: base peripheral, interrupt controller at device 0.1 (no driver attached)
pcib3: ACPI PCI-PCI bridge at device 0.2 on pci1
pci3: ACPI PCI bus on pcib3
pci1: base peripheral, interrupt controller at device 0.3 (no driver attached)
pcib4: ACPI PCI-PCI bridge irq 17 at device 28.0 on pci0
pci4: ACPI PCI bus on pcib4
pcib5: ACPI PCI-PCI bridge irq 17 at device 28.4 on pci0
pci5: ACPI PCI bus on pcib5
em0: Intel(R) PRO/1000 Network Connection Version - 3.2.18 port 0x4000-0x401f 
mem 0xed20-0xed21 irq 16 at device 0.0 on pci5
em0: Ethernet address: 00:30:48:84:89:58
pcib6: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0
pci6: ACPI PCI bus on pcib6
em1: Intel(R) PRO/1000 Network Connection Version - 3.2.18 port 0x5000-0x501f 
mem 0xed30-0xed31 irq 17 at device 0.0 on pci6
em1: Ethernet address: 00:30:48:84:89:59
uhci0: UHCI (generic) USB controller port 0x3000-0x301f irq 23 at
device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: UHCI (generic) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: UHCI (generic) USB controller port 0x3020-0x303f irq 19 at
device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: UHCI (generic) USB controller on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: UHCI (generic) USB controller port 0x3040-0x305f irq 18 at
device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: UHCI (generic) USB controller on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: UHCI (generic) USB controller port 0x3060-0x307f irq 16 at device 29.3 
on pci0
uhci3: [GIANT-LOCKED]
usb3: UHCI (generic) USB controller on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 

Re: Stability of ICH7 sata on FreeBSD 6.1 ?

2006-08-07 Thread Dominic Marks

Jerome Sobecki wrote:

Hi all,

We have here some Supermicro Superserver 5015P-TR
(http://www.supermicro.com/products/system/1U/5015/SYS-5015P-TR.cfm)

Those servers, with an ICH7 controller, are currently working with FreeBSD
6.1 and everything seems OK, except that it's the third time, on two
different machines, that the system has crashed because it lost its hard drive.
  

We have a Subversion server on a Dell box with an ICH7 chipset.
No problems so far (with Western Digital drives).

Dominic

The logs we get on console during the last crash (ad4s1g is /var, so we
don't have any other logs):
g_vfs_done():ad4s1g[WRITE(offset=35657547776, length=16384)]error = 6
[...]
g_vfs_done():ad4s1g[WRITE(offset=35662495744, length=16384)]error = 6
g_vfs_done():ad4s1g[READ(offset=23900815360, length=16384)]error = 6
handle_workitem_freeblocks: block count

The logs we have in /var/log/messages from another crash:
Jul 26 19:34:07 munster2 kernel: ad6: FAILURE - device detached
Jul 26 19:34:07 munster2 kernel: subdisk6: detached
Jul 26 19:34:07 munster2 kernel: ad6: detached

When the machine crashes, the LED of the lost disk stays lit, and a soft
reboot doesn't bring the disk back: a full power-off is
necessary.

Before the crash, the servers had more than 1 month of uptime in
production, and others are still OK...

information about the machine :

vieux-lille2% uname -v 
FreeBSD 6.1-STABLE #3: Thu Jun  8 12:47:45 CEST 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC 


It's true that it is not up to date, but I didn't see a cvs commit about that
problem (maybe I simply missed it).

Has anyone had that problem? Do you think updating the system from 6.1
sources will be enough?

Here is the dmesg, in case it helps:
vieux-lille2% dmesg
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights
reserved.
FreeBSD 6.1-STABLE #3: Thu Jun  8 12:47:45 CEST 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.40GHz (3400.15-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf43  Stepping = 3
  
Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
  Features2=0x649dSSE3,RSVD2,MON,DS_CPL,EST,CNTX-ID,CX16,b14
  AMD Features=0x2010NX,LM
  Logical CPUs per core: 2
real memory  = 1072562176 (1022 MB)
avail memory = 1040637952 (992 MB)
ACPI APIC Table: PTLTD  APIC  
ioapic0 Version 2.0 irqs 0-23 on motherboard
ioapic1 Version 2.0 irqs 24-47 on motherboard
ioapic2 Version 2.0 irqs 48-71 on motherboard
kbd1 at kbdmux0
acpi0: PTLTD   RSDT on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x1008-0x100b on acpi0
cpu0: ACPI CPU on acpi0
acpi_throttle0: ACPI CPU Throttling on cpu0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
pci1: ACPI PCI bus on pcib1
pcib2: ACPI PCI-PCI bridge at device 0.0 on pci1
pci2: ACPI PCI bus on pcib2
pci1: base peripheral, interrupt controller at device 0.1 (no driver attached)
pcib3: ACPI PCI-PCI bridge at device 0.2 on pci1
pci3: ACPI PCI bus on pcib3
pci1: base peripheral, interrupt controller at device 0.3 (no driver attached)
pcib4: ACPI PCI-PCI bridge irq 17 at device 28.0 on pci0
pci4: ACPI PCI bus on pcib4
pcib5: ACPI PCI-PCI bridge irq 17 at device 28.4 on pci0
pci5: ACPI PCI bus on pcib5
em0: Intel(R) PRO/1000 Network Connection Version - 3.2.18 port 0x4000-0x401f 
mem 0xed20-0xed21 irq 16 at device 0.0 on pci5
em0: Ethernet address: 00:30:48:84:89:58
pcib6: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0
pci6: ACPI PCI bus on pcib6
em1: Intel(R) PRO/1000 Network Connection Version - 3.2.18 port 0x5000-0x501f 
mem 0xed30-0xed31 irq 17 at device 0.0 on pci6
em1: Ethernet address: 00:30:48:84:89:59
uhci0: UHCI (generic) USB controller port 0x3000-0x301f irq 23 at
device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: UHCI (generic) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: UHCI (generic) USB controller port 0x3020-0x303f irq 19 at
device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: UHCI (generic) USB controller on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: UHCI (generic) USB controller port 0x3040-0x305f irq 18 at
device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: UHCI (generic) USB controller on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: UHCI (generic) USB controller port 0x3060-0x307f irq 
