On 6/19/2013 22:04, Jeremy Chadwick wrote:

On Wed, Jun 19, 2013 at 09:15:18PM +0700, Adam Strohl wrote:

On 6/19/2013 20:35, Jeremy Chadwick wrote:

I've snipped out portions which aren't relevant at this point in the
convo. I'm trying to be as terse as possible here (honest).

To recap for readers/mailing list:

- Adam sees the same behaviour on systems on bare metal, as well as
FreeBSD guests running under VMware ESXi 5.0 hypervisor. However,
as I stated on the list just yesterday about "lock-ups on shutdown",
every situation may be different and there is a well-established
history of this problem on FreeBSD where each root cause (bug)
was completely different from the others.

- The system we're discussing at this point in the thread is on
bare metal -- specifically an Asus P8B-X motherboard, with BIOS
version 6103, driven entirely by on-board Intel AHCI (not BIOS-level
RAID).

- Adam runs 9.1-RELEASE because of business needs pertaining to
freebsd-update and binary updates. (I ask more about this below for the
benefit of readers, however -- because this situation comes
up a lot and I want to know what real-world admins do)

This is all correct.

Thanks. I was mainly interested in the storage controller being used
(in this case ahci(4)) and the disks being used (notorious ST3000DM001,
known for excessively parking heads).

Yeah, was not my first choice but then again ... RAIDZ-2 :) HD
supply chain here (Thailand) is weird considering how many are made
here (and can't buy). Smartd screams about them possibly needing a
firmware update (they don't according to Seagate). Had no issues
aside from a failure a month or so ago (it's an HD ... it
happens).

Absolutely understood -- and FYI, in case you need backup, your thought
process/conclusion here is spot on (re: "it's a MHDD, failures happen").

Indeed :-D

Irrelevant to your shutdown problem: as for smartmontools bitching about
the firmware: no vendors disclose what actual changes go into their
drive firmware updates (vendors if you are reading this: I will have
your souls...), so I have to read a bunch of end-user forums where
nobody knows what they're talking about, and then of course find this
"highly educational" *cough* article from Adaptec:

http://ask.adaptec.com/app/answers/detail/a_id/17241/~/known-issues-with-seagate-barracuda-7200.14-desktop-drives

Yeah I agree .. I tried to firmware upgrade them when I was building the
system but it said they didn't qualify when using the boot ISO. I just
checked the site and it says no firmware update available too when using
their search by serial # tool. At this point I'm leery about updating
given that I've got data on it anyway. I do occasionally (maybe once a
week or two and they're in the same room as me/my office) hear one parking.

I see nothing wrong in smart though, no dmesg errors and have noticed no
issues with the array and it bench tests at around 850 MB/sec. Too bad
10 Gbit equipment isn't cheaper.

Also when I bought the 6 for this array I got a 7th as a cold spare :P

The problem here is that there have been *so many* firmware bugs with
Seagate's drives in the past 2 years or so that it's impossible for me
to know which fixes what. You buy what you buy because that's what you
buy, and that's cool -- but I avoid their stuff like the plague.

Yeah. I'd prefer WD myself but this place is swimming in "green" and
now "red" drives. uhgl.

<< Snipping out the unrelated parts ... >>

Can you try removing VESA and SC_PIXEL_MODE please? I know that
sounds crazy ("what on earth would that have to do with it?"), but
please try it. I can explain the justification if need be -- I'm being
extra paranoid of something that got discovered here on -stable only a
few days ago. It's a stretch, but I can see potential relevance. I can
provide details/links later.

No change unfortunately.

4. Does "sysctl hw.usb.no_shutdown_wait=1" help you?

Weirdly this allowed it to reboot on the first try (without needing
to be reset), but not the second.

I'm not surprised. Please re-try with stable/9; Hans has been constantly
working on the USB stack and fixing major bugs.

Got it but probably not going to go this route as it means no more
binary upgrades. While I can reboot it, it is the office NAS here
and so 'testing out' -STABLE I think probably isn't going to happen.

I understand. I have a question relating to this below.

Place background_fsck="no" in /etc/rc.conf. If the machine does not
have a clean filesystem on boot-up, you'll know because the system will
immediately begin fsck (in the foreground actively). You'll recognise
that output if it happens, trust me.

Preaching to the choir, we se

Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-19 Thread Warren Block


On Sun, 16 Jun 2013, Ian Lepore wrote:


On Sun, 2013-06-16 at 09:07 -0700, Jeremy Chadwick wrote:

On Sun, Jun 16, 2013 at 06:01:49PM +0200, Michiel Boland wrote:

On 06/16/2013 17:55, Jeremy Chadwick wrote:
[...]


Are you running moused(8)?  Actually, I can see quite clearly that you
are in your core.txt:

Starting ums0 moused.

Try turning that off.  Don't ask me how, because devd(8) / devd.conf(5)
might be involved.



The moused is started by devd - I don't see a quick way of turning that off.


Comment out the relevant crap in devd.conf(5).  Search for "ums"
and comment out the two "notify" sections.


I don't understand why people treat devd as if it's some sort of evil
virus that they're forced to live with (using phrases like "crap in
devd.conf").  In general, the standard devd rules tend to fall into 3
categories:
 * use logger(1) to record some anomaly
 * kldload a module
 * invoke a standard /etc/rc.d script

For moused, the devd rules invoke /etc/rc.d/moused, which implies that
setting moused_enable=NO in rc.conf would be all that's needed to
disable it.


Seems that way, but it's misleading.  Plug in a USB mouse, and devd will 
start moused anyway (with different options, but still...).  ISTR that 
can be disabled with


  moused_enable="NO"
  moused_nondefault_enable="NO"

I have not tested that lately.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

2013-06-19 Thread Matthew D. Fuller

On Wed, Jun 19, 2013 at 09:52:00AM -0700 I heard the voice of
Jeremy Chadwick, and lo! it spake thus:
> 
> Justified in your environment, but not in mine -- where most of my
> systems (at home) are extremely quiet (1000-1200rpm fans, lots of
> noise dampening material, etc.).  A 10C increase *during idle* is
> enough to make me wary.

Mmm.  Well, some of them are in 1U cases, and so behind very loud
little fans (but that's in a datacenter where *I* don't have to hear
it).  But the ones sitting beside me are behind <1kRPM fans (80 and
120 mm), and are around 28-30c (which is a tad high; the filters are
overdue for cleaning).  And ambient is probably 24-25.  I'd be
seriously creeped out if an *active* drive were 10 over ambient, much
less if flipping some config setting moved anything 10.

(this is also why I _hate_ laptops...)


> On the other hand, their forum was *filled* with post after post
> about the issue, including one fellow whose drive in something like
> 3 months was almost reaching MTBF head park/reload count.

Oh, sure.  If you don't get the stupid things to stop, you can measure
their life with an egg timer.  The 400-some these drives got before I
turned APM off happened in, like, an afternoon.


> If you had what I do (moderate-to-severe IBS), you'd know that it
> definitely doesn't get passed back in a more concentrated form.
> First joke I've been able to make about my health condition, yeah!

Well, if your diet consists of hard drive manufacturer's souls, it's
no wonder your system got all screwed up!  You gotta find something to
eat with more moral fiber!;p


-- 
Matthew Fuller (MF4839)   |  fulle...@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
   On the Internet, nobody can hear you scream.

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

On Wed, Jun 19, 2013 at 11:34:39AM -0500, Matthew D. Fuller wrote:
> On Wed, Jun 19, 2013 at 09:16:35AM -0700 I heard the voice of
> Jeremy Chadwick, and lo! it spake thus:
> > 
> > The above CDB + subcommand disables APM entirely.  There is a lot
> > more to APM than just parking heads (and in all honesty, APM should
> > have nothing to do with parking heads).  Disabling APM can actually
> > have drastic effects on drive temperature (meaning there are certain
> > chip and/or motor operations that said feature controls *in
> > addition* to head parking), and other firmware-level features that
> > aren't documented.
> 
> True enough, in concept.  With all the drives sitting behind
> ventilation perfectly capable of dealing with 15kRPM drives, I don't
> worry about what that might do to the 7200's though...

Justified in your environment, but not in mine -- where most of my
systems (at home) are extremely quiet (1000-1200rpm fans, lots of noise
dampening material, etc.).  A 10C increase *during idle* is enough to
make me wary.  I also have extremely sensitive hearing, so drives
clicking is something I can hear from quite a distance -- I guess
working with them for so long over the years has made me sensitive to
'em.

> > Furthermore, that CDB does not work for all drives.  There are
> > Seagate drives -- I know because I bought some and returned them
> > when the APM trick did not work -- that lack the LCC-disable tie-in
> to APM.  Some drives rejected the CDB (ATA status code error
> returned), while others accepted it but nothing in 0xec (IDENTIFY)
> was reported as changed.
> 
> Well, I haven't seen it with these.  Several of
> ada0:  ATA-8 SATA 3.x device
> and some systems with CC4C too.

The drives I was testing were STx000DM001.  I don't remember if I had a
DM002.  I also don't remember the firmware version they had on them, but
I do remember there were no updates available from Seagate at that time.
On the other hand, their forum was *filled* with post after post about
the issue, including one fellow whose drive in something like 3 months
was almost reaching MTBF head park/reload count.

But my point is this: 3.5" drives do not need this feature in 95% of
environments.  In desktop systems it's worthless -- in consumer desktops
it accomplishes nothing but noise and annoyance and impacts I/O, and in
business desktop environments it serves no purpose because most
places have their desktops go into sleep mode (so drive standby/sleep
gets used).  And in the server environment it's pure 100% worthless.

With 2.5" drives I can see it being more useful, but only if the drive
is used in a laptop.  There are NASes (and now servers too!) which use
2.5" drives, and I sure as hell wouldn't want that happening there.

So really it's just a bad feature all around that should be specific to
one environment demographic; the vendors should have made a 2.5" drive
"dedicated for laptops" that had this feature enabled, while disabled on
all other drives (2.5" and 3.5").  What we got was nearly the opposite.

> > I will have -- and eat -- their souls.
> 
> The problem with that is that the undigestible bits of "soul" just get
> passed right back into the ecosystem, and in a more concentrated form.
> 
> Some might suggest that's already happened, and is what got us here in
> the first place  8-}

If you had what I do (moderate-to-severe IBS), you'd know that it
definitely doesn't get passed back in a more concentrated form.  First
joke I've been able to make about my health condition, yeah!  Ha!  I
kill me! -- Alf

-- 
| Jeremy Chadwick   j...@koitsu.org |
| UNIX Systems Administrator   http://jdc.koitsu.org/ |
| Making life hard for others since 1977. PGP 4BD6C0CB |


Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

2013-06-19 Thread Matthew D. Fuller

On Wed, Jun 19, 2013 at 09:16:35AM -0700 I heard the voice of
Jeremy Chadwick, and lo! it spake thus:
> 
> The above CDB + subcommand disables APM entirely.  There is a lot
> more to APM than just parking heads (and in all honesty, APM should
> have nothing to do with parking heads).  Disabling APM can actually
> have drastic effects on drive temperature (meaning there are certain
> chip and/or motor operations that said feature controls *in
> addition* to head parking), and other firmware-level features that
> aren't documented.

True enough, in concept.  With all the drives sitting behind
ventilation perfectly capable of dealing with 15kRPM drives, I don't
worry about what that might do to the 7200's though...


> Furthermore, that CDB does not work for all drives.  There are
> Seagate drives -- I know because I bought some and returned them
> when the APM trick did not work -- that lack the LCC-disable tie-in
> to APM.  Some drives rejected the CDB (ATA status code error
> returned), while others accepted it but nothing in 0xec (IDENTIFY)
> was reported as changed.

Well, I haven't seen it with these.  Several of
ada0:  ATA-8 SATA 3.x device
and some systems with CC4C too.


> I will have -- and eat -- their souls.

The problem with that is that the undigestible bits of "soul" just get
passed right back into the ecosystem, and in a more concentrated form.

Some might suggest that's already happened, and is what got us here in
the first place  8-}


-- 
Matthew Fuller (MF4839)   |  fulle...@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
   On the Internet, nobody can hear you scream.

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

On Wed, Jun 19, 2013 at 10:53:46AM -0500, Matthew D. Fuller wrote:
> On Wed, Jun 19, 2013 at 08:04:14AM -0700 I heard the voice of
> Jeremy Chadwick, and lo! it spake thus:
> > 
> > 
> > Readers: if any of you have a ST[123]000DM001 drive running the CC24
> > firmware, and can confirm high head parking counts (SMART attribute
> > 193), and are willing to upgrade your drive firmware to the latest then
> > see if the LCC increments stop (or at least settle down to normal
> > levels), I'd love to hear from you.  I have been socially boycotting
> > these models of drives because of that idiotic firmware design choice
> > for quite some time now (not to mention the parking on those drives
> > is audibly loud in a normal living room), and if the F/W actually
> > inhibits the excessive parking then I have some drives to consider
> > upgrading.  :-)
> > 
> 
> I dunno about firmware, but you can smack 'em with a big hammer...
> 
> /etc/rc.local:
> for i in 0 1; do
> /sbin/camcontrol cmd ada${i} -a "EF 85 00 00 00 00 00 00 00 00 00 00"
> done
> 
> x-ref:
> http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052997.html
> 
> 
> LCC was somewhere in the upper 400's (I wanna say 480-some?) a year
> and change ago when I dropped that in.  It's 506/493 now on the two
> drives.

The above CDB + subcommand disables APM entirely.  There is a lot more
to APM than just parking heads (and in all honesty, APM should have
nothing to do with parking heads).  Disabling APM can actually have
drastic effects on drive temperature (meaning there are certain chip
and/or motor operations that said feature controls *in addition* to head
parking), and other firmware-level features that aren't documented.

Furthermore, that CDB does not work for all drives.  There are Seagate
drives -- I know because I bought some and returned them when the APM
trick did not work -- that lack the LCC-disable tie-in to APM.  Some
drives rejected the CDB (ATA status code error returned), while
others accepted it but nothing in 0xec (IDENTIFY) was reported as
changed.
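
One way to see whether a drive actually took the change is to look at the
APM row of the IDENTIFY data before and after sending the CDB.  A rough
sketch, assuming the "advanced power management" row that camcontrol
identify prints has the Support and Enabled columns as its 4th and 5th
whitespace-separated fields (the helper name apm_state is made up here, and
the column layout should be checked against your own camcontrol output):

```shell
#!/bin/sh
# Hypothetical helper: read `camcontrol identify <dev>` output on stdin and
# report the Support/Enabled columns of the APM feature row.
# Assumption: the row starts with "advanced power management" and the next
# two fields are the yes/no Support and Enabled flags.
apm_state() {
    awk '/^advanced power management/ {
        print "supported=" $4, "enabled=" $5
    }'
}
```

Usage would be something like "camcontrol identify ada0 | apm_state", run
once before and once after the SET FEATURES command.  If the output is
identical both times, the drive accepted the command without changing
anything.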

The only model of drive I know that reliably works with this method is
the WD Green/-GP drive, and the drive temperatures do increase.  No idea
on the Blues.  (Another reason I recommend the Reds...)

What *should* have happened is the creation of a new 0xef subcommand
for this.  Subs range from 0x00-0xff.  T13 spec shows
that a huge number of them (I'd say 30% or more) are marked "Reserved"
and an additional 30% or so are marked "Obsolete".  And finally,
0x56-0x5c, 0xd6-0xdc and 0xe0 are "Vendor Specific".
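
To make the ranges concrete, here is a small sketch that classifies a 0xef
subcommand code using only the "Vendor Specific" ranges quoted above
(0x56-0x5c, 0xd6-0xdc and 0xe0).  The function name is invented for
illustration, and the ranges are taken from this message rather than
re-checked against the T13 document:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the SET FEATURES (0xEF)
# subcommand code in $1 lies in a range listed above as "Vendor Specific".
# $1 may be decimal or 0x-prefixed hex; shell arithmetic handles both.
is_vendor_specific() {
    code=$(( $1 ))
    if [ "$code" -ge $((0x56)) ] && [ "$code" -le $((0x5c)) ]; then
        return 0
    fi
    if [ "$code" -ge $((0xd6)) ] && [ "$code" -le $((0xdc)) ]; then
        return 0
    fi
    if [ "$code" -eq $((0xe0)) ]; then
        return 0
    fi
    return 1
}
```

For example, "is_vendor_specific 0x85" fails, because 0x85 (the APM disable
subcommand used in the CDB above) is a standard code and not in any of those
ranges.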

But looking at this from a more general view, the real issue is that
these types of features should not have been introduced to begin with.
The vendors introduced this problem, and now are marketing drives with
said feature disabled, claiming "we fixed the problem that annoys so
many of you!" -- the same problem **they introduced without asking
anyone**.

I will have -- and eat -- their souls.

-- 
| Jeremy Chadwick   j...@koitsu.org |
| UNIX Systems Administrator   http://jdc.koitsu.org/ |
| Making life hard for others since 1977. PGP 4BD6C0CB |


Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

2013-06-19 Thread Matthew D. Fuller

On Wed, Jun 19, 2013 at 08:04:14AM -0700 I heard the voice of
Jeremy Chadwick, and lo! it spake thus:
> 
> 
> Readers: if any of you have a ST[123]000DM001 drive running the CC24
> firmware, and can confirm high head parking counts (SMART attribute
> 193), and are willing to upgrade your drive firmware to the latest then
> see if the LCC increments stop (or at least settle down to normal
> levels), I'd love to hear from you.  I have been socially boycotting
> these models of drives because of that idiotic firmware design choice
> for quite some time now (not to mention the parking on those drives
> is audibly loud in a normal living room), and if the F/W actually
> inhibits the excessive parking then I have some drives to consider
> upgrading.  :-)
> 

I dunno about firmware, but you can smack 'em with a big hammer...

/etc/rc.local:
# SET FEATURES (0xEF), subcommand 0x85: disable APM on ada0 and ada1
for i in 0 1; do
    /sbin/camcontrol cmd "ada${i}" -a "EF 85 00 00 00 00 00 00 00 00 00 00"
done

x-ref:
http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052997.html


LCC was somewhere in the upper 400's (I wanna say 480-some?) a year
and change ago when I dropped that in.  It's 506/493 now on the two
drives.
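
For anyone wanting to track the count over time, a minimal sketch that pulls
the raw value of SMART attribute 193 out of "smartctl -A" output.  It
assumes the usual smartmontools attribute table, where the ID is the first
column and the raw value is the tenth; the function name is made up here:

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A <dev>` output on stdin and print
# the raw value of attribute 193 (Load_Cycle_Count).
# Assumption: standard attribute table layout, ID in column 1 and
# RAW_VALUE starting in column 10.
lcc_from_smartctl() {
    awk '$1 == 193 { print $10 }'
}
```

Something like "smartctl -A /dev/ada0 | lcc_from_smartctl" run from cron and
appended to a log would show whether the count is still climbing after the
camcontrol command above has been applied.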


-- 
Matthew Fuller (MF4839)   |  fulle...@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
   On the Internet, nobody can hear you scream.

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

On Wed, Jun 19, 2013 at 09:15:18PM +0700, Adam Strohl wrote:
> On 6/19/2013 20:35, Jeremy Chadwick wrote:

I've snipped out portions which aren't relevant at this point in the
convo.  I'm trying to be as terse as possible here (honest).

To recap for readers/mailing list:

- Adam sees the same behaviour on systems on bare metal, as well as
  FreeBSD guests running under VMware ESXi 5.0 hypervisor.  However,
  as I stated on the list just yesterday about "lock-ups on shutdown",
  every situation may be different and there is a well-established
  history of this problem on FreeBSD where each root cause (bug)
  was completely different from the others.

- The system we're discussing at this point in the thread is on
  bare metal -- specifically an Asus P8B-X motherboard, with BIOS
  version 6103, driven entirely by on-board Intel AHCI (not BIOS-level
  RAID).

- Adam runs 9.1-RELEASE because of business needs pertaining to
  freebsd-update and binary updates.  (I ask more about this below for the
  benefit of readers, however -- because this situation comes
  up a lot and I want to know what real-world admins do)

> >Thanks.  I was mainly interested in the storage controller being used
> >(in this case ahci(4)) and the disks being used (notorious ST3000DM001,
> >known for excessively parking heads).
> 
> Yeah, was not my first choice but then again ... RAIDZ-2 :)  HD
> supply chain here (Thailand) is weird considering how many are made
> here (and can't buy).  Smartd screams about them possibly needing a
> firmware update (they don't according to Seagate).   Had no issues
> aside from a failure a month or so ago (it's an HD ... it
> happens).

Absolutely understood -- and FYI, in case you need backup, your thought
process/conclusion here is spot on (re: "it's a MHDD, failures happen").

Irrelevant to your shutdown problem: as for smartmontools bitching about
the firmware: no vendors disclose what actual changes go into their
drive firmware updates (vendors if you are reading this: I will have
your souls...), so I have to read a bunch of end-user forums where
nobody knows what they're talking about, and then of course find this
"highly educational" *cough* article from Adaptec:

http://ask.adaptec.com/app/answers/detail/a_id/17241/~/known-issues-with-seagate-barracuda-7200.14-desktop-drives

The problem here is that there have been *so many* firmware bugs with
Seagate's drives in the past 2 years or so that it's impossible for me
to know which fixes what.  You buy what you buy because that's what you
buy, and that's cool -- but I avoid their stuff like the plague.

Readers: if any of you have a ST[123]000DM001 drive running the CC24
firmware, and can confirm high head parking counts (SMART attribute
193), and are willing to upgrade your drive firmware to the latest then
see if the LCC increments stop (or at least settle down to normal
levels), I'd love to hear from you.  I have been socially boycotting
these models of drives because of that idiotic firmware design choice
for quite some time now (not to mention the parking on those drives
is audibly loud in a normal living room), and if the F/W actually
inhibits the excessive parking then I have some drives to consider
upgrading.  :-)

> >I can also see you're running your own kernel.  We'll get to that in a
> >moment.
> 
> It's GENERIC with the following added to the end:
> 
> # -- Add Support for nicer console
> #
> options VESA
> options SC_PIXEL_MODE

Can you try removing VESA and SC_PIXEL_MODE please?  I know that
sounds crazy ("what on earth would that have to do with it?"), but
please try it.  I can explain the justification if need be -- I'm being
extra paranoid of something that got discovered here on -stable only a
few days ago.  It's a stretch, but I can see potential relevance.  I can
provide details/links later.

> >>>4. Does "sysctl hw.usb.no_shutdown_wait=1" help you?
> >>
> >>Weirdly this allowed it to reboot on the first try (without needing
> >>to be reset), but not the second.
> >
> >I'm not surprised.  Please re-try with stable/9; Hans has been constantly
> >working on the USB stack and fixing major bugs.
> 
> Got it but probably not going to go this route as it means no more
> binary upgrades.  While I can reboot it, it is the office NAS here
> and so 'testing out' -STABLE I think probably isn't going to happen.

I understand.  I have a question relating to this below.

> >Place background_fsck="no" in /etc/rc.conf.  If the machine does not
> >have a clean filesystem on boot-up, you'll know because the system will
> >immediately begin fsck (in the foreground actively).  You'll recog

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

- Original Message - 
From: "Adam Strohl" 

To: "Steven Hartland" 
Cc: "Jeremy Chadwick" ; 
Sent: Wednesday, June 19, 2013 3:29 PM
Subject: Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly 
dismount

On 6/19/2013 21:21, Steven Hartland wrote:

You still need to test if stable/9 fixes your issue though as otherwise
you don't know if the issue you're seeing has already been fixed, and if
it's the old known ZFS VFS hang on shutdown, it has.

Thanks Steve, understood but probably not going to happen with this box.
I can reboot this thing but it's our NAS and not a test bed.  This
problem on this machine isn't a big deal because it's a server and not
rebooted often (and easy to bring back).  But I was more hoping it would
let me easily test solutions to the issue, since the other servers
showing the issue are in client production, keeping in mind that the VMs
not using ZFS also show a similar/identical issue.  My gut says it
appeared in/with 9.1 (we never saw this with 9.0 servers).  It is also
possible this is a different issue from those other servers and VMs.

How far away is 9.2? ;-P

Depending on how things go with Jeremy I'll probably have to wait this 
out unless I can get a test machine or VM where I can reproduce the 
issue AND upgrade it to -STABLE (again assuming it's even the same issue).

Don't rule out there being more than one issue at play.

   Regards
   Steve

This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 

In the event of misdirection, illegible or incomplete transmission please 
telephone +44 845 868 1337
or return the E.mail to postmas...@multiplay.co.uk.


Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount


On 6/19/2013 21:21, Steven Hartland wrote:

You still need to test if stable/9 fixes your issue though as otherwise
you don't know if the issue you're seeing has already been fixed, and if
it's the old known ZFS VFS hang on shutdown, it has.


Thanks Steve, understood but probably not going to happen with this box.
I can reboot this thing but it's our NAS and not a test bed.  This
problem on this machine isn't a big deal because it's a server and not
rebooted often (and easy to bring back).  But I was more hoping it would
let me easily test solutions to the issue, since the other servers
showing the issue are in client production, keeping in mind that the VMs
not using ZFS also show a similar/identical issue.  My gut says it
appeared in/with 9.1 (we never saw this with 9.0 servers).  It is also
possible this is a different issue from those other servers and VMs.


How far away is 9.2? ;-P

Depending on how things go with Jeremy I'll probably have to wait this 
out unless I can get a test machine or VM where I can reproduce the 
issue AND upgrade it to -STABLE (again assuming it's even the same issue).


Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

- Original Message - 
From: "Adam Strohl" 

To: "Jeremy Chadwick" 
Cc: 
Sent: Wednesday, June 19, 2013 3:15 PM
Subject: Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly 
dismount

On 6/19/2013 20:35, Jeremy Chadwick wrote:

Nope, I see basically the same thing sometimes under ESXi 5.0
Hypervisor (and yes, the implications of something so broad worry
me).  Those units I just haven't been able to isolate on a
server which isn't critical.  Let's focus on this server for now
though per your suggestion below.

I'm sorry but I don't understand your first sentence -- the first part
of your sentence says "nope" (I have to assume in reply to my "on bare
metal" part), but then says "I see basically the same thing sometimes
under ESXi" which implies an alternate environment in comparison (i.e.
we *are* talking about bare metal).  Consider me confused.  :-)

Basically: The issue is extremely similar if not the same root cause, be 
it a native or virtual server.  This server though is native.

2. We need to know what version of "9.1" you're using, i.e. 9.1-RELEASE.
If you use stable/9 (RELENG_9) we need to see uname -a output (you can
hide the machine name if you want).

Sorry, this ZFS box is 9.1-R P4 (kernel built today):

FreeBSD ilos.dsn 9.1-RELEASE-p4 FreeBSD 9.1-RELEASE-p4 #6: Wed Jun
19 15:31:12 ICT 2013
root@hostname:/usr/obj/usr/src/sys/ATEAMSYSTEMS  amd64

I suggest trying stable/9 (and staying with it, for that matter).

The issue is no binary updates and we have a large deploy base, so we've 
stuck with -R and use it internally because it's what we deploy.

You still need to test if stable/9 fixes your issue though as otherwise
you don't know if the issue you're seeing has already been fixed, and if
it's the old known ZFS VFS hang on shutdown, it has.

   Regards
   Steve



Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount


On 6/19/2013 20:35, Jeremy Chadwick wrote:


Nope, I see basically the same thing sometimes under ESXi 5.0
Hypervisor (and yes, the implications of something so broad worry
me).  Those units I just haven't been able to isolate on a
server which isn't critical.  Let's focus on this server for now
though per your suggestion below.


I'm sorry but I don't understand your first sentence -- the first part
of your sentence says "nope" (I have to assume in reply to my "on bare
metal" part), but then says "I see basically the same thing sometimes
under ESXi" which implies an alternate environment in comparison (i.e.
we *are* talking about bare metal).  Consider me confused.  :-)


Basically: The issue is extremely similar if not the same root cause, be 
it a native or virtual server.  This server though is native.





2. We need to know what version of "9.1" you're using, i.e. 9.1-RELEASE.
If you use stable/9 (RELENG_9) we need to see uname -a output (you can
hide the machine name if you want).


Sorry, this ZFS box is 9.1-R P4 (kernel built today):

FreeBSD ilos.dsn 9.1-RELEASE-p4 FreeBSD 9.1-RELEASE-p4 #6: Wed Jun
19 15:31:12 ICT 2013
root@hostname:/usr/obj/usr/src/sys/ATEAMSYSTEMS  amd64


I suggest trying stable/9 (and staying with it, for that matter).


The issue is no binary updates and we have a large deploy base, so we've 
stuck with -R and use it internally because it's what we deploy.





3. Can we please have dmesg from this machine?  The controller and some
other hardware details matter.


Sure take a look at the full log here: http://pastebin.com/k55gVVuU

This includes a boot, then a reboot as I describe (you can see it
logs the All Buffers Synced, etc) then powering back on.


Thanks.  I was mainly interested in the storage controller being used
(in this case ahci(4)) and the disks being used (notorious ST3000DM001,
known for excessively parking heads).


Yeah, it was not my first choice, but then again ... RAIDZ-2 :)  The HD 
supply chain here (Thailand) is weird considering how many are made here 
(and yet can't be bought).  Smartd screams about them possibly needing a 
firmware update (they don't, according to Seagate).  I've had no issues 
aside from a failure a month or so ago (it's an HD ... it happens).



AFAIK this isn't one of the
controllers that was known for weird "quirky issues" pertaining to
flushing data to disk on shutdown.

I have to ask: is this FreeBSD box running under a HV?


No, native/direct for sure on this one.



If it *is not* running under a HV, could we please get exact motherboard
model and version (including BIOS version)?  Sometimes (not always) you
can get this from "kenv | grep smbios."


No problem, I built this one personally:

Asus P8B-X BIOS revision 6103
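
For reference, the board and BIOS details requested above can also be read
one key at a time with kenv(1).  This is only a minimal sketch: it assumes
the loader exported SMBIOS data on the machine, the function name board_info
is purely illustrative, and the three smbios.* keys are the usual ones but
may be missing on some hardware.

```sh
# Print motherboard maker, model and BIOS version from the kernel
# environment.  Falls back to "unknown" when a key (or kenv itself)
# is not available on this system.
board_info() {
    for key in smbios.planar.maker smbios.planar.product smbios.bios.version; do
        value=$(kenv -q "$key" 2>/dev/null) || value="unknown"
        printf '%s=%s\n' "$key" "$value"
    done
}

board_info
```

The output is one key=value line per item, which is easy to paste into a
mailing list reply alongside uname -a and dmesg.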




I can also see you're running your own kernel.  We'll get to that in a
moment.


It's GENERIC with the following added to the end:

# -- Add Support for nicer console
#
options VESA
options SC_PIXEL_MODE

# -- PF Support
#
device pf
device pflog
device pfsync

# -- Core temperature reporting
#
device  coretemp # For Intel CPUs

device  smbios




4. Does "sysctl hw.usb.no_shutdown_wait=1" help you?


Weirdly this allowed it to reboot on the first try (without needing
to be reset), but not the second.


I'm not surprised.  Please re-try with stable/9; Hans has been constantly
working on the USB stack and fixing major bugs.


Got it, but I'm probably not going to go this route as it means no more 
binary upgrades.  While I can reboot it, it is the office NAS here, so 
'testing out' -STABLE probably isn't going to happen.





The "Starting background file
system checks in 60 seconds" message appeared ... that only happens
when something is dirty, right?


No it does not.  That message is always printed when you use background
fsck, which is the default.


Got it.



I do not advocate using background fsck, because it has been known (and
may still do this -- I do not care to find out, I do not have time for
unreliable filesystem nonsense) to not always fix all filesystem
problems.  Meaning: people using background fsck have been known to boot
into single-user and issue "fsck" manually and find issues.

Place background_fsck="no" in /etc/rc.conf.  If the machine does not
have a clean filesystem on boot-up, you'll know because the system will
immediately begin fsck (in the foreground actively).  You'll recognise
that output if it happens, trust me.


Preaching to the choir: we set this on all servers, but this one somehow 
did not have it set (I think because ZFS made it unique and our rc.conf 
template was not copied over properly).
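
Since rc.conf is plain sh(1) variable assignments, a quick audit of whether a
given box actually carries the setting can be sketched as follows.  The
function name background_fsck_setting is hypothetical, and the sketch only
looks at the one file it is given (it does not consult rc.conf.local or
/etc/defaults/rc.conf).

```sh
# Report the background_fsck value assigned in an rc.conf-style file.
# Prints "unset" when the file does not assign the variable, in which
# case the system default applies.
background_fsck_setting() {
    conf="${1:-/etc/rc.conf}"
    (
        # Subshell: sourcing the file must not leak into the caller.
        unset background_fsck
        [ -r "$conf" ] && . "$conf"
        printf '%s\n' "${background_fsck:-unset}"
    )
}

# Usage: background_fsck_setting /etc/rc.conf
```

Anything other than "no" (in any letter case) on a server that is supposed to
follow the template is worth a second look.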





So the second try with just this I could ctrl alt del it and it
responded .. kind of:
http://i.imgur.com/POAIaNg.jpg

Still had to reset it though.


This looks like a c

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

- Original Message - 
From: "Ronald Klop" 



On Wed, 19 Jun 2013 14:53:19 +0200, Adam Strohl  
 wrote:



On 6/19/2013 19:21, Jeremy Chadwick wrote:

On Wed, Jun 19, 2013 at 06:35:57PM +0700, Adam Strohl wrote:

Hello -STABLE@,

So I've seen this situation seemingly randomly on a number of both
physical 9.1 boxes as well as VMs for I would say 6-9 months at
least.  I finally have a physical box here that reproduces it
consistently that I can reboot easily (ie; not a production/client
server).


Hi,

My home computer had the same symptom (not rebooting after 'all buffers  
flushed' message) a couple of months ago. But I follow 9-STABLE and the  
problem is gone for a while now.


avg@ did a lot of work on the ZFS vfs locking which fixed at least one
hang on reboot for ZFS. I don't believe this is in 9.1-RELEASE, so you
should test a stable/9 or 8.4-RELEASE (which is newer than 9.1-RELEASE)
kernel.

   Regards
   Steve


This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 


In the event of misdirection, illegible or incomplete transmission please 
telephone +44 845 868 1337
or return the E.mail to postmas...@multiplay.co.uk.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

On Wed, Jun 19, 2013 at 07:53:19PM +0700, Adam Strohl wrote:
> On 6/19/2013 19:21, Jeremy Chadwick wrote:
> >On Wed, Jun 19, 2013 at 06:35:57PM +0700, Adam Strohl wrote:
> >>Hello -STABLE@,
> >>
> >>So I've seen this situation seemingly randomly on a number of both
> >>physical 9.1 boxes as well as VMs for I would say 6-9 months at
> >>least.  I finally have a physical box here that reproduces it
> >>consistently that I can reboot easily (ie; not a production/client
> >>server).
> >>
> >>No matter what I do:
> >>
> >>reboot
> >>shutdown -p
> >>shutdown -r
> >>
> >>This specific server will stop at "All buffers synced" and not
> >>actually power down or reboot.  KB input seems to be ignored.  This
> >>server is a ZFS NAS (with GMIRROR for boot blocks) but the other
> >>boxes which show this are using GMIRRORs for root/swap/boot (no
> >>ZFS).
> >>
> >>Here is what happens on the console: http://i.imgur.com/1H8JMyB.jpg
> >>
> >>When I reset the server it appears that disks were not dismounted
> >>cleanly ... on this ZFS box it comes back quick because ZFS is good
> >>like that but on the other servers with GMIRROR roots rebuilding the
> >>GMIRROR and fscking at the same time is murder on the
> >>disk/performance until it finishes.
> >
> >1. You mention "as well as VMs".  Anything under a "virtual machine" or
> >under a hypervisor is going to be very, very, **VERY** different than
> >bare metal.  So I hope the issues you're talking about above are on bare
> >metal -- I will assume so.
> 
> Nope, I see basically the same thing sometimes under ESXi 5.0
> Hypervisor (and yes, the implications of something so broad worry
> me).  I just haven't been able to isolate those units on a
> server which isn't critical.  Let's focus on this server for now
> though, per your suggestion below.

I'm sorry but I don't understand your first sentence -- the first part
of your sentence says "nope" (I have to assume in reply to my "on bare
metal" part), but then says "I see basically the same thing sometimes
under ESXi" which implies an alternate environment in comparison (i.e.
we *are* talking about bare metal).  Consider me confused.  :-)

> >2. We need to know what version of "9.1" you're using, i.e. 9.1-RELEASE.
> >If you use stable/9 (RELENG_9) we need to see uname -a output (you can
> >hide the machine name if you want).
> 
> Sorry, this ZFS box is 9.1-R P4 (kernel built today):
> 
> FreeBSD ilos.dsn 9.1-RELEASE-p4 FreeBSD 9.1-RELEASE-p4 #6: Wed Jun
> 19 15:31:12 ICT 2013
> root@hostname:/usr/obj/usr/src/sys/ATEAMSYSTEMS  amd64

I suggest trying stable/9 (and staying with it, for that matter).

> >3. Can we please have dmesg from this machine?  The controller and some
> >other hardware details matter.
> 
> Sure take a look at the full log here: http://pastebin.com/k55gVVuU
> 
> This includes a boot, then a reboot as I describe (you can see it
> logs the All Buffers Synced, etc) then powering back on.

Thanks.  I was mainly interested in the storage controller being used
(in this case ahci(4)) and the disks being used (notorious ST3000DM001,
known for excessively parking heads).  AFAIK this isn't one of the
controllers that was known for weird "quirky issues" pertaining to
flushing data to disk on shutdown.

I have to ask: is this FreeBSD box running under a HV?

If it *is not* running under a HV, could we please get exact motherboard
model and version (including BIOS version)?  Sometimes (not always) you
can get this from "kenv | grep smbios."

I can also see you're running your own kernel.  We'll get to that in a
moment.

> >4. Does "sysctl hw.usb.no_shutdown_wait=1" help you?
> 
> Weirdly this allowed it to reboot on the first try (without needing
> to be reset), but not the second.

I'm not surprised.  Please re-try with stable/9; Hans has been constantly
working on the USB stack and fixing major bugs.

> The "Starting background file
> system checks in 60 seconds" message appeared ... that only happens
> when something is dirty, right?

No it does not.  That message is always printed when you use background
fsck, which is the default.

I do not advocate using background fsck, because it has been known (and
may still do this -- I do not care to find out, I do not have time for
unreliable filesystem nonsense) to not always fix all filesystem
problems.  Meaning: people using background fsck have been known to boot
into single-user and issue "fsck" manually and find iss

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

2013-06-19 Thread Ronald Klop

On Wed, 19 Jun 2013 14:53:19 +0200, Adam Strohl  
 wrote:



On 6/19/2013 19:21, Jeremy Chadwick wrote:

On Wed, Jun 19, 2013 at 06:35:57PM +0700, Adam Strohl wrote:

Hello -STABLE@,

So I've seen this situation seemingly randomly on a number of both
physical 9.1 boxes as well as VMs for I would say 6-9 months at
least.  I finally have a physical box here that reproduces it
consistently that I can reboot easily (ie; not a production/client
server).


Hi,

My home computer had the same symptom (not rebooting after 'all buffers  
flushed' message) a couple of months ago. But I follow 9-STABLE and the  
problem is gone for a while now.


Ronald.



No matter what I do:

reboot
shutdown -p
shutdown -r

This specific server will stop at "All buffers synced" and not
actually power down or reboot.  KB input seems to be ignored.  This
server is a ZFS NAS (with GMIRROR for boot blocks) but the other
boxes which show this are using GMIRRORs for root/swap/boot (no
ZFS).

Here is what happens on the console: http://i.imgur.com/1H8JMyB.jpg

When I reset the server it appears that disks were not dismounted
cleanly ... on this ZFS box it comes back quick because ZFS is good
like that but on the other servers with GMIRROR roots rebuilding the
GMIRROR and fscking at the same time is murder on the
disk/performance until it finishes.


1. You mention "as well as VMs".  Anything under a "virtual machine" or
under a hypervisor is going to be very, very, **VERY** different than
bare metal.  So I hope the issues you're talking about above are on bare
metal -- I will assume so.


Nope, I see basically the same thing sometimes under ESXi 5.0 Hypervisor  
(and yes, the implications of something so broad worry me).  I just  
haven't been able to isolate those units on a server which isn't  
critical.  Let's focus on this server for now though, per your suggestion  
below.




2. We need to know what version of "9.1" you're using, i.e. 9.1-RELEASE.
If you use stable/9 (RELENG_9) we need to see uname -a output (you can
hide the machine name if you want).


Sorry, this ZFS box is 9.1-R P4 (kernel built today):

FreeBSD ilos.dsn 9.1-RELEASE-p4 FreeBSD 9.1-RELEASE-p4 #6: Wed Jun 19  
15:31:12 ICT 2013 root@hostname:/usr/obj/usr/src/sys/ATEAMSYSTEMS   
amd64




3. Can we please have dmesg from this machine?  The controller and some
other hardware details matter.


Sure take a look at the full log here: http://pastebin.com/k55gVVuU

This includes a boot, then a reboot as I describe (you can see it logs  
the All Buffers Synced, etc) then powering back on.




4. Does "sysctl hw.usb.no_shutdown_wait=1" help you?


Weirdly this allowed it to reboot on the first try (without needing to  
be reset), but not the second.  The "Starting background file system  
checks in 60 seconds" message appeared ... that only happens when  
something is dirty, right?


So the second try with just this I could ctrl alt del it and it  
responded .. kind of:

http://i.imgur.com/POAIaNg.jpg

Still had to reset it though.



5. Does "sysctl hw.acpi.handle_reboot=1" help you?


No change, still responded to a ctrl alt del like above, but like that  
still needs to be reset and comes back dirty.




6. Does "sysctl hw.acpi.disable_on_reboot=1" help you?


No change.  Same as above, ctrl alt del responds but needs a hard reset  
still.




7. If none of the above helps, can you please boot verbose mode and then
when the system "locks up" on "shutdown -r now" take a picture of the
VGA console?


Lots of debug on boot obviously but not much different on shutdown/hang:
http://i.imgur.com/SgzSsoP.jpg



8. Does the machine run moused(8) (check the process list please, do not
rely on rc.conf) ?


ps -auxww | grep moused reveals nothing running (which is how I have  
things set).





Another interesting thing is that this particular server runs slapd
(OpenLDAP) which, when it comes back up, has a "corrupted" DB
(easily fixed with db_recover, but still).  This might be because FS
commits aren't happening at the end.   I can even manually stop
slapd (service slapd stop) then run sync(8) (I assume this does
something for ZFS too) and it still comes back as hosed if I reboot
shortly after.  If I start/stop slapd it's fine.  So I feel like
there is an FS/dismount thing going on here.


sync(8) does not do what you think it does.  Please read (not skim) this
entire thread starting here:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/thread.html#16982
http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/016982.html


Grokking this now ..



Your problem is related to unclean shutdown; fix that and your issues go
away.


Yeah that is my feeling as well.




Additional information: I also have some boxes which will reboot
(ie; they don't freeze like some do at the end) but they don't
dismount cle

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount


On 6/19/2013 19:53, Adam Strohl wrote:

sync(8) does not do what you think it does.  Please read (not skim) this
entire thread starting here:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/thread.html#16982

http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/016982.html


Grokking this now ..



Epic.  So basically "mount -u -o ro " is really what I (and probably 
everyone else) want, and the sync(8) man page needs a major overhaul plus 
a disclaimer (and possibly a recommendation to use "mount -u -o ro " 
instead).
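
A minimal sketch of that takeaway, for a UFS filesystem.  The mount point
/data used in the usage line is a hypothetical example (the thread does not
name one), the function name remount_readonly is illustrative, and DRY_RUN is
an assumed convenience variable so the command can be previewed first.  The
remount can be expected to fail if files are still open for writing.

```sh
# Remount a filesystem read-only so that pending writes are flushed
# before a reset.  With DRY_RUN set, only print the command.
remount_readonly() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "mount -u -o ro $1"
    else
        mount -u -o ro "$1"
    fi
}

# Usage (hypothetical mount point):
#   service slapd stop
#   remount_readonly /data
```

Stopping the service that writes to the filesystem first, as in the usage
comment, gives the remount the best chance of succeeding.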



--
Adam Strohl
http://www.ateamsystems.com/

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount


On 6/19/2013 19:21, Jeremy Chadwick wrote:

On Wed, Jun 19, 2013 at 06:35:57PM +0700, Adam Strohl wrote:

Hello -STABLE@,

So I've seen this situation seemingly randomly on a number of both
physical 9.1 boxes as well as VMs for I would say 6-9 months at
least.  I finally have a physical box here that reproduces it
consistently that I can reboot easily (ie; not a production/client
server).

No matter what I do:

reboot
shutdown -p
shutdown -r

This specific server will stop at "All buffers synced" and not
actually power down or reboot.  KB input seems to be ignored.  This
server is a ZFS NAS (with GMIRROR for boot blocks) but the other
boxes which show this are using GMIRRORs for root/swap/boot (no
ZFS).

Here is what happens on the console: http://i.imgur.com/1H8JMyB.jpg

When I reset the server it appears that disks were not dismounted
cleanly ... on this ZFS box it comes back quick because ZFS is good
like that but on the other servers with GMIRROR roots rebuilding the
GMIRROR and fscking at the same time is murder on the
disk/performance until it finishes.


1. You mention "as well as VMs".  Anything under a "virtual machine" or
under a hypervisor is going to be very, very, **VERY** different than
bare metal.  So I hope the issues you're talking about above are on bare
metal -- I will assume so.


Nope, I see basically the same thing sometimes under ESXi 5.0 Hypervisor 
(and yes, the implications of something so broad worry me).  I just 
haven't been able to isolate those units on a server which isn't 
critical.  Let's focus on this server for now though, per your suggestion 
below.




2. We need to know what version of "9.1" you're using, i.e. 9.1-RELEASE.
If you use stable/9 (RELENG_9) we need to see uname -a output (you can
hide the machine name if you want).


Sorry, this ZFS box is 9.1-R P4 (kernel built today):

FreeBSD ilos.dsn 9.1-RELEASE-p4 FreeBSD 9.1-RELEASE-p4 #6: Wed Jun 19 
15:31:12 ICT 2013 root@hostname:/usr/obj/usr/src/sys/ATEAMSYSTEMS  amd64




3. Can we please have dmesg from this machine?  The controller and some
other hardware details matter.


Sure take a look at the full log here: http://pastebin.com/k55gVVuU

This includes a boot, then a reboot as I describe (you can see it logs 
the All Buffers Synced, etc) then powering back on.




4. Does "sysctl hw.usb.no_shutdown_wait=1" help you?


Weirdly this allowed it to reboot on the first try (without needing to 
be reset), but not the second.  The "Starting background file system 
checks in 60 seconds" message appeared ... that only happens when 
something is dirty, right?


So the second try with just this I could ctrl alt del it and it 
responded .. kind of:

http://i.imgur.com/POAIaNg.jpg

Still had to reset it though.



5. Does "sysctl hw.acpi.handle_reboot=1" help you?


No change, still responded to a ctrl alt del like above, but like that 
still needs to be reset and comes back dirty.




6. Does "sysctl hw.acpi.disable_on_reboot=1" help you?


No change.  Same as above, ctrl alt del responds but needs a hard reset 
still.




7. If none of the above helps, can you please boot verbose mode and then
when the system "locks up" on "shutdown -r now" take a picture of the
VGA console?


Lots of debug on boot obviously but not much different on shutdown/hang:
http://i.imgur.com/SgzSsoP.jpg



8. Does the machine run moused(8) (check the process list please, do not
rely on rc.conf) ?


ps -auxww | grep moused reveals nothing running (which is how I have 
things set).
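
One caveat with a plain "grep moused" is that the grep process itself can
show up in the ps output.  A small sketch that avoids this, assuming ps-style
text on standard input; the helper name find_moused is illustrative only.

```sh
# Filter ps-style input for moused(8).  The [m] bracket expression
# matches "moused" but not the literal text "[m]oused", so the grep
# command line never matches itself.
find_moused() {
    grep '[m]oused'
}

# Usage: ps -axww | find_moused || echo "moused is not running"
```

pgrep -lf moused is an alternative where pgrep(1) is available.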





Another interesting thing is that this particular server runs slapd
(OpenLDAP) which, when it comes back up, has a "corrupted" DB
(easily fixed with db_recover, but still).  This might be because FS
commits aren't happening at the end.   I can even manually stop
slapd (service slapd stop) then run sync(8) (I assume this does
something for ZFS too) and it still comes back as hosed if I reboot
shortly after.  If I start/stop slapd it's fine.  So I feel like
there is an FS/dismount thing going on here.


sync(8) does not do what you think it does.  Please read (not skim) this
entire thread starting here:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/thread.html#16982
http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/016982.html


Grokking this now ..



Your problem is related to unclean shutdown; fix that and your issues go
away.


Yeah that is my feeling as well.




Additional information: I also have some boxes which will reboot
(ie; they don't freeze like some do at the end) but they don't
dismount cleanly either and have to rebuild both GMIRROR and fsck.
This might be a different issue, too.


Every issue needs to be handled/treated separately.


Sure, I just had run across some threads about that but will focus on 
this ZFS box (and see

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

OS version?
- Original Message - 
From: "Adam Strohl" 

To: 
Sent: Wednesday, June 19, 2013 12:35 PM
Subject: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

Hello -STABLE@,

So I've seen this situation seemingly randomly on a number of both 
physical 9.1 boxes as well as VMs for I would say 6-9 months at least. 
 I finally have a physical box here that reproduces it consistently 
that I can reboot easily (ie; not a production/client server).

No matter what I do:

reboot
shutdown -p
shutdown -r

This specific server will stop at "All buffers synced" and not actually 
power down or reboot.  KB input seems to be ignored.  This server is a 
ZFS NAS (with GMIRROR for boot blocks) but the other boxes which show 
this are using GMIRRORs for root/swap/boot (no ZFS).

Here is what happens on the console: http://i.imgur.com/1H8JMyB.jpg

When I reset the server it appears that disks were not dismounted 
cleanly ... on this ZFS box it comes back quick because ZFS is good like 
that but on the other servers with GMIRROR roots rebuilding the GMIRROR 
and fscking at the same time is murder on the disk/performance until it 
finishes.

Another interesting thing is that this particular server runs slapd 
(OpenLDAP) which, when it comes back up, has a "corrupted" DB (easily 
fixed with db_recover, but still).  This might be because FS commits 
aren't happening at the end.   I can even manually stop slapd (service 
slapd stop) then run sync(8) (I assume this does something for ZFS too) 
and it still comes back as hosed if I reboot shortly after.  If I 
start/stop slapd it's fine.  So I feel like there is an FS/dismount 
thing going on here.

Additional information: I also have some boxes which will reboot (ie; 
they don't freeze like some do at the end) but they don't dismount 
cleanly either and have to rebuild both GMIRROR and fsck.  This might be 
a different issue, too.

Anyone have any thoughts?  Let me know if I can provide more details etc.

--
Adam Strohl
http://www.ateamsystems.com/

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

On Wed, Jun 19, 2013 at 06:35:57PM +0700, Adam Strohl wrote:
> Hello -STABLE@,
> 
> So I've seen this situation seemingly randomly on a number of both
> physical 9.1 boxes as well as VMs for I would say 6-9 months at
> least.  I finally have a physical box here that reproduces it
> consistently that I can reboot easily (ie; not a production/client
> server).
> 
> No matter what I do:
> 
> reboot
> shutdown -p
> shutdown -r
> 
> This specific server will stop at "All buffers synced" and not
> actually power down or reboot.  KB input seems to be ignored.  This
> server is a ZFS NAS (with GMIRROR for boot blocks) but the other
> boxes which show this are using GMIRRORs for root/swap/boot (no
> ZFS).
> 
> Here is what happens on the console: http://i.imgur.com/1H8JMyB.jpg
> 
> When I reset the server it appears that disks were not dismounted
> cleanly ... on this ZFS box it comes back quick because ZFS is good
> like that but on the other servers with GMIRROR roots rebuilding the
> GMIRROR and fscking at the same time is murder on the
> disk/performance until it finishes.

1. You mention "as well as VMs".  Anything under a "virtual machine" or
under a hypervisor is going to be very, very, **VERY** different than
bare metal.  So I hope the issues you're talking about above are on bare
metal -- I will assume so.

2. We need to know what version of "9.1" you're using, i.e. 9.1-RELEASE.
If you use stable/9 (RELENG_9) we need to see uname -a output (you can
hide the machine name if you want).

3. Can we please have dmesg from this machine?  The controller and some
other hardware details matter.

4. Does "sysctl hw.usb.no_shutdown_wait=1" help you?

5. Does "sysctl hw.acpi.handle_reboot=1" help you?

6. Does "sysctl hw.acpi.disable_on_reboot=1" help you?

7. If none of the above helps, can you please boot verbose mode and then
when the system "locks up" on "shutdown -r now" take a picture of the
VGA console?

8. Does the machine run moused(8) (check the process list please, do not
rely on rc.conf) ?
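
Items 4 to 6 above are easiest to evaluate one at a time, with a reboot
attempt between each.  The following sketch only prints the commands and the
current values; the variable SHUTDOWN_TUNABLES and the function name
show_tunable_commands are illustrative, and "n/a" simply means sysctl(8) did
not recognise the OID on the machine where it was run.

```sh
# Tunables suggested in the list above.
SHUTDOWN_TUNABLES="hw.usb.no_shutdown_wait hw.acpi.handle_reboot hw.acpi.disable_on_reboot"

# Print one "sysctl <oid>=1" line per tunable, annotated with the
# value currently reported by sysctl(8), if any.
show_tunable_commands() {
    for oid in $SHUTDOWN_TUNABLES; do
        current=$(sysctl -n "$oid" 2>/dev/null) || current="n/a"
        printf 'sysctl %s=1    # current: %s\n' "$oid" "$current"
    done
}

show_tunable_commands
```

A setting that turns out to help can be made persistent by adding the
corresponding "name=1" line to /etc/sysctl.conf.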

> Another interesting thing is that this particular server runs slapd
> (OpenLDAP) which, when it comes back up, has a "corrupted" DB
> (easily fixed with db_recover, but still).  This might be because FS
> commits aren't happening at the end.   I can even manually stop
> slapd (service slapd stop) then run sync(8) (I assume this does
> something for ZFS too) and it still comes back as hosed if I reboot
> shortly after.  If I start/stop slapd it's fine.  So I feel like
> there is an FS/dismount thing going on here.

sync(8) does not do what you think it does.  Please read (not skim) this
entire thread starting here:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/thread.html#16982
http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/016982.html

Your problem is related to unclean shutdown; fix that and your issues go
away.

> Additional information: I also have some boxes which will reboot
> (ie; they don't freeze like some do at the end) but they don't
> dismount cleanly either and have to rebuild both GMIRROR and fsck.
> This might be a different issue, too.

Every issue needs to be handled/treated separately.

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount


Hello -STABLE@,

So I've seen this situation seemingly randomly on a number of both 
physical 9.1 boxes as well as VMs for I would say 6-9 months at least. 
 I finally have a physical box here that reproduces it consistently 
that I can reboot easily (ie; not a production/client server).


No matter what I do:

reboot
shutdown -p
shutdown -r

This specific server will stop at "All buffers synced" and not actually 
power down or reboot.  KB input seems to be ignored.  This server is a 
ZFS NAS (with GMIRROR for boot blocks) but the other boxes which show 
this are using GMIRRORs for root/swap/boot (no ZFS).


Here is what happens on the console: http://i.imgur.com/1H8JMyB.jpg

When I reset the server it appears that disks were not dismounted 
cleanly ... on this ZFS box it comes back quick because ZFS is good like 
that but on the other servers with GMIRROR roots rebuilding the GMIRROR 
and fscking at the same time is murder on the disk/performance until it 
finishes.


Another interesting thing is that this particular server runs slapd 
(OpenLDAP) which, when it comes back up, has a "corrupted" DB (easily 
fixed with db_recover, but still).  This might be because FS commits 
aren't happening at the end.   I can even manually stop slapd (service 
slapd stop) then run sync(8) (I assume this does something for ZFS too) 
and it still comes back as hosed if I reboot shortly after.  If I 
start/stop slapd it's fine.  So I feel like there is an FS/dismount 
thing going on here.


Additional information: I also have some boxes which will reboot (ie; 
they don't freeze like some do at the end) but they don't dismount 
cleanly either and have to rebuild both GMIRROR and fsck.  This might be 
a different issue, too.


Anyone have any thoughts?  Let me know if I can provide more details etc.

--
Adam Strohl
http://www.ateamsystems.com/

Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-18 Thread Jeremy Chadwick

On Wed, Jun 19, 2013 at 01:41:10AM +0430, Javad Kouhi wrote:
> I've read the posts again. Although the issue looks the same as Michiel
> Boland's (first link), I'm not sure the root cause is the same
> as Michiel's (second link). Anyway, it should be discussed in
> another thread, as you said.

Let me be more clear:

I have seen repeated reports from people complaining about "lockups when
shutting down" many times over the years.  The ones I remember:

- Certain oddities with SCSI/SATA storage drivers and disks (many of
  these have been fixed)
- ACPI-based reboot not working correctly on some motherboards
  (depends on hw.acpi.handle_reboot and sometimes
  hw.acpi.disable_on_reboot) -- not sure if this still pops up
- USB layer causing issues, or possibly some USB CAM integration
  problem (this is still an ongoing one)
- Now some sort of weird Intel graphics driver (and DRM?) quirk
  involving moused(8) and Vsync (the issue reported by Michiel)

And I'm certain I'm forgetting others.

What Kevin Oberman said also applies -- these are painful to debug
because the system is already in a "shutting down" state where usability
and accessibility are at a bare minimum, and you're kind of at your
wits' end.

Booting verbose can help -- there are other messages printed to the VGA
(and/or serial) console during the shutdown phase when verbose.
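
As a config sketch: verbose boot can be requested once with "boot -v" at the
loader prompt, or made persistent with the following line in
/boot/loader.conf.  This assumes the stock FreeBSD loader is in use.

```sh
# /boot/loader.conf
boot_verbose="YES"    # same effect as "boot -v" at the loader prompt
```

Removing the line again afterwards keeps routine boots quiet.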

All you can hope for is that the kernel is still alive, that Ctrl-Alt-Esc
works to force a drop to DDB (assuming all of this is enabled in your
kernel), and that someone familiar with the FreeBSD kernel can help you
debug it (possibly it's just easier to drop to DDB, type "panic", then
issue "call doadump" to force a dump to swap at that point -- kib@
might have better recommendations).
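
Whether "all of this is enabled" can be checked against the kernel config
before the next hang.  The sketch below greps a config file for the KDB and
DDB options; the helper name kernel_has_option is illustrative, and the
config path in the usage comment is an assumption based on the conventional
location for an amd64 kernel.  It does not follow "include" lines, so options
inherited from an included config are not seen.

```sh
# Succeed if the kernel config file $2 contains "options $1".
kernel_has_option() {
    grep -Eq "^[[:space:]]*options[[:space:]]+$1([[:space:]]|\$)" "$2"
}

# Usage (assumed path):
#   for opt in KDB DDB; do
#       kernel_has_option "$opt" /usr/src/sys/amd64/conf/MYKERNEL ||
#           echo "missing: options $opt"
#   done
```

For "call doadump" to produce something usable, a dump device also has to be
configured, for example with dumpdev="AUTO" in /etc/rc.conf.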

Serial console can also greatly help, because quite often there are
pages upon pages of debugging information that are useful; otherwise you
have to hope the VGA console keyboard is functional (even more tricky
with USB) and that Scroll Lock + Page Up/Down function, along with taking
photos of the screen.  Doing it this way is stressful and painful for
everyone involved.
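
A serial console can be enabled from /boot/loader.conf.  The fragment below
is a sketch only: the line speed is an assumed value that has to match
whatever is attached to the port, and it presumes the board exposes a usable
serial port at all.

```sh
# /boot/loader.conf
boot_multicons="YES"               # allow more than one console
console="comconsole,vidconsole"    # serial first, VGA second
comconsole_speed="115200"          # assumed speed; match the other end
```

With both consoles active, shutdown messages can be captured to a log on the
remote end instead of being photographed off the screen.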

I hope this sheds some light on why I said what I did.  :-)

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

I've read the posts again. Although the issue looks the same as Michiel
Boland's (first link), I'm not sure the root cause is the same
as Michiel's (second link). Anyway, it should be discussed in
another thread, as you said.

> Second, the patch is not mine -- it's Konstantin's.  I did not write the
> code/fix, nor do I understand it.  All I did was provide a version of
> the same patch that applied cleanly on recent stable/9.  (I'm sorry for
> needing to state this, but clear ownership of code/issues is important.)

That's understandable, thank you both. I didn't mean it that way.

Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-18 Thread Jeremy Chadwick

On Tue, Jun 18, 2013 at 10:37:10PM +0430, Javad Kouhi wrote:
> On Tue, Jun 18, 2013 at 7:17 PM, Jeremy Chadwick  wrote:
> >
> > I do not use git, I use svn, so I cannot help you with git "crap".
> >
> > Please revert your sys/dev/drm2/i915/intel_fb.c and
> > sys/dev/syscons/scvgarndr.c back to r251934 (or newer) before following
> > what I tell you below.
> >
> > The problem is either that:
> >
> > - The patch you were given is probably for a different FreeBSD release,
> >   thus the code/line numbers/info in the code break the fuzzy logic
> >   matching,
> > - You copy-pasted the diff and because of tabs vs. spaces botched it,
> > - git apply/patch/whatever is weird,
> > - Multitudes of other possibilities I do not care to go into.
> >
> > The hack kib@ gave you is not hard to manually add yourself.  It's very
> > few lines of code.  I'm very surprised you didn't try to manually add it
> > yourself.  So I have done that for you.  First, the proof -- this is
> > against r251939, by the way, but that shouldn't matter as nobody has
> > touched this between r251934 and r251939:
> >
> > $ svn info
> > Path: .
> > Working Copy Root Path: /home/jdc/work/src
> > URL: svn://svn.freebsd.org/base/stable/9
> > Repository Root: svn://svn.freebsd.org/base
> > Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
> > Revision: 251939
> > Node Kind: directory
> > Schedule: normal
> > Last Changed Author: marius
> > Last Changed Rev: 251939
> > Last Changed Date: 2013-06-18 07:20:14 -0700 (Tue, 18 Jun 2013)
> >
> > $ svn status
> > M   sys/dev/drm2/i915/intel_fb.c
> > M   sys/dev/syscons/scvgarndr.c
> >
> > The diff itself is available here:
> >
> > http://jdc.koitsu.org/freebsd/sysmouse_vsync.diff
> >
> > I've also attached it here in Email (assuming the mailing list doesn't
> > delete it).
> >
> > You should apply the patch using:
> >
> >   cd /usr/src  (or wherever your source is)
> >   patch -p0 < sysmouse_vsync.diff
> >
> > Assuming use of svn, you can revert this patch by doing:
> >
> >   cd /usr/src  (or wherever your source is)
> >   svn revert sys/dev/drm2/i915/intel_fb.c
> >   svn revert sys/dev/syscons/scvgarndr.c
> >   rm sys/dev/drm2/i915/intel_fb.c.orig
> >   rm sys/dev/syscons/scvgarndr.c.orig
> >
> > There is probably some other "magical" way to do all of this, but as
> > anyone here knows, I do things manually because in general I do not
> > trust VCSes or the "magic" they do under the hood; I prefer to do things
> > that I know work.
> >
> > Good luck -- I cannot help with any other aspect to the issue.
> >
> > --
> > | Jeremy Chadwick   j...@koitsu.org |
> > | UNIX Systems Administratorhttp://jdc.koitsu.org/ |
> > | Making life hard for others since 1977. PGP 4BD6C0CB |
> >
> 
> Many thanks for the detailed answer. I've applied your patch and then
> rebuilt the world and kernel. To be honest, I tried to apply the patch
> manually but the syntax was too complex for me. Thanks for the help to
> apply the patch.
> 
> Unfortunately, the original issue is still exist and shutdown(8)
> doesn't work properly. I'm a newbie and I don't know what informations
> I should provide, but here is some basic information:
> 
> % uname -a
> FreeBSD minootux 9.1-STABLE FreeBSD 9.1-STABLE #0 r251946M: Tue Jun 18
> 21:16:56 IRDT 2013 root@minootux:/usr/obj/usr/src/sys/GIGABYTE
> amd64
> 
> % pkg_info -I -x xorg-server -x drm
> libdrm-2.4.44   Userspace interface to kernel Direct Rendering Module servi
> xorg-server-1.12.4,1 X.Org X server and related programs
> 
> The machine is a laptop and the following link contains the details
> about the hardware:
> http://www.gigabyte.com/products/product-page.aspx?pid=3793#sp
> 
> KMS and NEW_XORG are enabled in my /etc/make.conf.

First, what makes you think your issue is the same issue as reported by
Michiel Boland?  Let me point you to two of his posts (read them slowly
and in full please):

http://lists.freebsd.org/pipermail/freebsd-stable/2013-June/073821.html

http://lists.freebsd.org/pipermail/freebsd-stable/2013-June/073839.html

Second, the patch is not mine -- it's Konstantin's.  I did not write the
code/fix, nor do I understand it.  All I did was provide a version of
the same patch that applied cleanly on recent stable/9.  (I'm sorry for
needing to state this, but clear ownership of code/issues is important.)

Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

On Tue, Jun 18, 2013 at 7:17 PM, Jeremy Chadwick  wrote:
>
> I do not use git, I use svn, So I cannot help you with git "crap".
>
> Please revert your sys/dev/drm2/i915/intel_fb.c and
> sys/dev/syscons/scvgarndr.c back to r251934 (or newer) before following
> what I tell you below.
>
> The problem is either that:
>
> - The patch you were given is probably for a different FreeBSD release,
>   thus the code/line numbers/info in the code break the fuzzy logic
>   matching,
> - You copy-pasted the diff and because of tabs vs. spaces botched it,
> - git apply/patch/whatever is weird,
> - Multitudes of other possibilities I do not care to go into.
>
> The hack kib@ gave you is not hard to manually add yourself.  It's very
> few lines of code.  I'm very surprised you didn't try to manually add it
> yourself.  So I have done that for you.  First, the proof -- this is
> against r251939, by the way, but that shouldn't matter as nobody has
> touched this between r251934 and r251939:
>
> $ svn info
> Path: .
> Working Copy Root Path: /home/jdc/work/src
> URL: svn://svn.freebsd.org/base/stable/9
> Repository Root: svn://svn.freebsd.org/base
> Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
> Revision: 251939
> Node Kind: directory
> Schedule: normal
> Last Changed Author: marius
> Last Changed Rev: 251939
> Last Changed Date: 2013-06-18 07:20:14 -0700 (Tue, 18 Jun 2013)
>
> $ svn status
> M   sys/dev/drm2/i915/intel_fb.c
> M   sys/dev/syscons/scvgarndr.c
>
> The diff itself is available here:
>
> http://jdc.koitsu.org/freebsd/sysmouse_vsync.diff
>
> I've also attached it here in Email (assuming the mailing list doesn't
> delete it).
>
> You should apply the patch using:
>
>   cd /usr/src  (or wherever your source is)
>   patch -p0 < sysmouse_vsync.diff
>
> Assuming use of svn, you can revert this patch by doing:
>
>   cd /usr/src  (or wherever your source is)
>   svn revert sys/dev/drm2/i915/intel_fb.c
>   svn revert sys/dev/syscons/scvgarndr.c
>   rm sys/dev/drm2/i915/intel_fb.c.orig
>   rm sys/dev/syscons/scvgarndr.c.orig
>
> There is probably some other "magical" way to do all of this, but as
> anyone here knows, I do things manually because in general I do not
> trust VCSes or the "magic" they do under the hood; I prefer to do things
> that I know work.
>
> Good luck -- I cannot help with any other aspect to the issue.
>
> --
> | Jeremy Chadwick   j...@koitsu.org |
> | UNIX Systems Administratorhttp://jdc.koitsu.org/ |
> | Making life hard for others since 1977. PGP 4BD6C0CB |
>

Many thanks for the detailed answer. I've applied your patch and then
rebuilt world and kernel. To be honest, I tried to apply the patch
manually, but the syntax was too complex for me. Thanks for helping me
apply it.

Unfortunately, the original issue still exists and shutdown(8)
doesn't work properly. I'm a newbie and I don't know what information
I should provide, but here is some basic information:

% uname -a
FreeBSD minootux 9.1-STABLE FreeBSD 9.1-STABLE #0 r251946M: Tue Jun 18 21:16:56 IRDT 2013 root@minootux:/usr/obj/usr/src/sys/GIGABYTE amd64

% pkg_info -I -x xorg-server -x drm
libdrm-2.4.44   Userspace interface to kernel Direct Rendering Module servi
xorg-server-1.12.4,1 X.Org X server and related programs

The machine is a laptop and the following link contains the details
about the hardware:
http://www.gigabyte.com/products/product-page.aspx?pid=3793#sp

KMS and NEW_XORG are enabled in my /etc/make.conf.

Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-18 Thread Kevin Oberman

On Tue, Jun 18, 2013 at 7:47 AM, Jeremy Chadwick  wrote:

> On Tue, Jun 18, 2013 at 07:00:30PM +0430, Javad Kouhi wrote:
> > Thanks for the reply, seems that our source trees are not same, I got this:
> >
> > % patch -p1 < /path/to/patch
> > Hmm...  Looks like a unified diff to me...
> > The text leading up to this was:
> > --
> > |diff --git a/sys/dev/drm2/i915/intel_fb.c b/sys/dev/drm2/i915/intel_fb.c
> > |index 3cb3b78..e41a49f 100644
> > |--- a/sys/dev/drm2/i915/intel_fb.c
> > |+++ b/sys/dev/drm2/i915/intel_fb.c
> > --
> > Patching file sys/dev/drm2/i915/intel_fb.c using Plan A...
> > Hunk #1 succeeded at 207 with fuzz 1.
> > Hunk #2 failed at 231.
> > 1 out of 2 hunks failed--saving rejects to sys/dev/drm2/i915/intel_fb.c.rej
> > Hmm...  The next patch looks like a unified diff to me...
> > The text leading up to this was:
> > --
> > |diff --git a/sys/dev/syscons/scvgarndr.c b/sys/dev/syscons/scvgarndr.c
> > |index 6e6663c..fc7f02f 100644
> > |--- a/sys/dev/syscons/scvgarndr.c
> > |+++ b/sys/dev/syscons/scvgarndr.c
> > --
> > Patching file sys/dev/syscons/scvgarndr.c using Plan A...
> > Hunk #1 succeeded at 395.
> > Hunk #2 failed at 447.
> > 1 out of 2 hunks failed--saving rejects to sys/dev/syscons/scvgarndr.c.rej
> > done
> >
> >
> > And the git way:
> >
> > % git apply /path/to/patch
> > error: patch failed: sys/dev/drm2/i915/intel_fb.c:207
> > error: sys/dev/drm2/i915/intel_fb.c: patch does not apply
> > error: patch failed: sys/dev/syscons/scvgarndr.c:445
> > error: sys/dev/syscons/scvgarndr.c: patch does not apply
> >
> >
> > I have revision 251934 of -STABLE branch. (I updated my source tree
> > about 3 hours ago using svn)
>
> I do not use git, I use svn, So I cannot help you with git "crap".
>
> Please revert your sys/dev/drm2/i915/intel_fb.c and
> sys/dev/syscons/scvgarndr.c back to r251934 (or newer) before following
> what I tell you below.
>
> The problem is either that:
>
> - The patch you were given is probably for a different FreeBSD release,
>   thus the code/line numbers/info in the code break the fuzzy logic
>   matching,
> - You copy-pasted the diff and because of tabs vs. spaces botched it,
> - git apply/patch/whatever is weird,
> - Multitudes of other possibilities I do not care to go into.
>
> The hack kib@ gave you is not hard to manually add yourself.  It's very
> few lines of code.  I'm very surprised you didn't try to manually add it
> yourself.  So I have done that for you.  First, the proof -- this is
> against r251939, by the way, but that shouldn't matter as nobody has
> touched this between r251934 and r251939:
>
> $ svn info
> Path: .
> Working Copy Root Path: /home/jdc/work/src
> URL: svn://svn.freebsd.org/base/stable/9
> Repository Root: svn://svn.freebsd.org/base
> Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
> Revision: 251939
> Node Kind: directory
> Schedule: normal
> Last Changed Author: marius
> Last Changed Rev: 251939
> Last Changed Date: 2013-06-18 07:20:14 -0700 (Tue, 18 Jun 2013)
>
> $ svn status
> M   sys/dev/drm2/i915/intel_fb.c
> M   sys/dev/syscons/scvgarndr.c
>
> The diff itself is available here:
>
> http://jdc.koitsu.org/freebsd/sysmouse_vsync.diff
>
> I've also attached it here in Email (assuming the mailing list doesn't
> delete it).
>
> You should apply the patch using:
>
>   cd /usr/src  (or wherever your source is)
>   patch -p0 < sysmouse_vsync.diff
>
> Assuming use of svn, you can revert this patch by doing:
>
>   cd /usr/src  (or wherever your source is)
>   svn revert sys/dev/drm2/i915/intel_fb.c
>   svn revert sys/dev/syscons/scvgarndr.c
>   rm sys/dev/drm2/i915/intel_fb.c.orig
>   rm sys/dev/syscons/scvgarndr.c.orig
>
> There is probably some other "magical" way to do all of this, but as
> anyone here knows, I do things manually because in general I do not
> trust VCSes or the "magic" they do under the hood; I prefer to do things
> that I know work.
>
> Good luck -- I cannot help with any other aspect to the issue.
>

After some testing, I now believe that there are two failure modes causing
hangs on shutdown on Intel_KMS systems. Data is mostly empirical, but it
looks pretty clear to me.

Failure 1: Shutdown proceeds through a significant portion of the shutdown
before hanging shortly before syncing disks. Shutdown has passed the point
of most userland capability, so no real access is available.

Failure2:

Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-18 Thread Jeremy Chadwick

On Tue, Jun 18, 2013 at 07:00:30PM +0430, Javad Kouhi wrote:
> Thanks for the reply, seems that our source trees are not same, I got this:
> 
> % patch -p1 < /path/to/patch
> Hmm...  Looks like a unified diff to me...
> The text leading up to this was:
> --
> |diff --git a/sys/dev/drm2/i915/intel_fb.c b/sys/dev/drm2/i915/intel_fb.c
> |index 3cb3b78..e41a49f 100644
> |--- a/sys/dev/drm2/i915/intel_fb.c
> |+++ b/sys/dev/drm2/i915/intel_fb.c
> --
> Patching file sys/dev/drm2/i915/intel_fb.c using Plan A...
> Hunk #1 succeeded at 207 with fuzz 1.
> Hunk #2 failed at 231.
> 1 out of 2 hunks failed--saving rejects to sys/dev/drm2/i915/intel_fb.c.rej
> Hmm...  The next patch looks like a unified diff to me...
> The text leading up to this was:
> --
> |diff --git a/sys/dev/syscons/scvgarndr.c b/sys/dev/syscons/scvgarndr.c
> |index 6e6663c..fc7f02f 100644
> |--- a/sys/dev/syscons/scvgarndr.c
> |+++ b/sys/dev/syscons/scvgarndr.c
> --
> Patching file sys/dev/syscons/scvgarndr.c using Plan A...
> Hunk #1 succeeded at 395.
> Hunk #2 failed at 447.
> 1 out of 2 hunks failed--saving rejects to sys/dev/syscons/scvgarndr.c.rej
> done
> 
> 
> And the git way:
> 
> % git apply /path/to/patch
> error: patch failed: sys/dev/drm2/i915/intel_fb.c:207
> error: sys/dev/drm2/i915/intel_fb.c: patch does not apply
> error: patch failed: sys/dev/syscons/scvgarndr.c:445
> error: sys/dev/syscons/scvgarndr.c: patch does not apply
> 
> 
> I have revision 251934 of -STABLE branch. (I updated my source tree
> about 3 hours ago using svn)

I do not use git, I use svn, so I cannot help you with git "crap".

Please revert your sys/dev/drm2/i915/intel_fb.c and
sys/dev/syscons/scvgarndr.c back to r251934 (or newer) before following
what I tell you below.

The problem is either that:

- The patch you were given is probably for a different FreeBSD release,
  thus the code/line numbers/info in the code break the fuzzy logic
  matching,
- You copy-pasted the diff and because of tabs vs. spaces botched it,
- git apply/patch/whatever is weird,
- Multitudes of other possibilities I do not care to go into.

The hack kib@ gave you is not hard to manually add yourself.  It's very
few lines of code.  I'm very surprised you didn't try to manually add it
yourself.  So I have done that for you.  First, the proof -- this is
against r251939, by the way, but that shouldn't matter as nobody has
touched this between r251934 and r251939:

$ svn info
Path: .
Working Copy Root Path: /home/jdc/work/src
URL: svn://svn.freebsd.org/base/stable/9
Repository Root: svn://svn.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 251939
Node Kind: directory
Schedule: normal
Last Changed Author: marius
Last Changed Rev: 251939
Last Changed Date: 2013-06-18 07:20:14 -0700 (Tue, 18 Jun 2013)

$ svn status
M   sys/dev/drm2/i915/intel_fb.c
M   sys/dev/syscons/scvgarndr.c

The diff itself is available here:

http://jdc.koitsu.org/freebsd/sysmouse_vsync.diff

I've also attached it here in Email (assuming the mailing list doesn't
delete it).

You should apply the patch using:

  cd /usr/src  (or wherever your source is)
  patch -p0 < sysmouse_vsync.diff
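
A note on the -p level, as a rough sketch rather than part of the
original instructions: the git-format diff posted by Konstantin names
files as a/sys/... and b/sys/..., which needs -p1, while the svn diff
here uses paths relative to the source root, which needs -p0. The helper
name guess_strip_level is made up for illustration, and the check is
only a heuristic based on those header prefixes.

```sh
#!/bin/sh
# Guess the -p (strip) level that a unified diff needs.
# Heuristic only: git-style diffs name files as "a/path" and "b/path",
# so one leading path component must be stripped (-p1); "svn diff"
# output uses paths relative to the working copy root (-p0).
# guess_strip_level is a hypothetical helper, not a standard tool.
guess_strip_level() {
    diff_file=$1
    if grep -q '^--- a/' "$diff_file" && grep -q '^+++ b/' "$diff_file"; then
        echo 1
    else
        echo 0
    fi
}
```

It could then be used as:

  patch -p"$(guess_strip_level sysmouse_vsync.diff)" < sysmouse_vsync.diff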

Assuming use of svn, you can revert this patch by doing:

  cd /usr/src  (or wherever your source is)
  svn revert sys/dev/drm2/i915/intel_fb.c
  svn revert sys/dev/syscons/scvgarndr.c
  rm sys/dev/drm2/i915/intel_fb.c.orig
  rm sys/dev/syscons/scvgarndr.c.orig
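
If hunks failed on an earlier attempt, patch(1) will also have left .rej
files next to the .orig backups (see the "saving rejects to" lines in
the output quoted above). A small sketch for finding both kinds of
leftovers; the function name list_patch_leftovers is hypothetical, and
the /usr/src default is an assumption taken from the commands above.

```sh
#!/bin/sh
# List backup (.orig) and reject (.rej) files that patch(1) left behind
# under a source tree. The directory argument defaults to /usr/src,
# which is an assumption; pass your own source path if it differs.
# list_patch_leftovers is a hypothetical helper name.
list_patch_leftovers() {
    src_dir=${1:-/usr/src}
    find "$src_dir" -type f \( -name '*.orig' -o -name '*.rej' \) | sort
}
```

Running list_patch_leftovers /usr/src only prints paths; deleting them
is left as a separate, deliberate step.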

There is probably some other "magical" way to do all of this, but as
anyone here knows, I do things manually because in general I do not
trust VCSes or the "magic" they do under the hood; I prefer to do things
that I know work.

Good luck -- I cannot help with any other aspect to the issue.

-- 
| Jeremy Chadwick   j...@koitsu.org |
| UNIX Systems Administrator http://jdc.koitsu.org/ |
| Making life hard for others since 1977. PGP 4BD6C0CB |

Index: sys/dev/drm2/i915/intel_fb.c
===
--- sys/dev/drm2/i915/intel_fb.c	(revision 251939)
+++ sys/dev/drm2/i915/intel_fb.c	(working copy)
@@ -207,6 +207,8 @@ static void intel_fbdev_destroy(struct drm_device
 	}
 }

+extern int sc_txtmouse_no_retrace_wait;
+
 int intel_fbdev_init(struct drm_device *dev)
 {
 	struct intel_fbdev *ifbdev;
@@ -229,6 +231,7 @@ int intel_fbdev_init(struct drm_device *dev)

 	drm_fb_helper_single_add_all_connectors(&ifbdev->helper);
 	drm_fb_helper_initial_config(&ifbdev->helper, 32);
+	sc_txtmouse_no_retrace_wait = 1;
 	return 0;
 }

Index: sys/dev/syscons/scvgarndr.c
===
--- sys/dev/syscons/scvgarndr.c	(revision 251939)
+++ sys/dev/syscons/scvgarndr.c	(working copy)
@@ -395,6 +395,8 @@ vga_txtblink(scr_stat *scp, int at, int flip)
 {
 }

+int sc_txtmouse_no_retrace_wait;
+
 #ifndef SC_NO_CUTP

Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

Thanks for the reply. It seems that our source trees are not the same; I got this:

% patch -p1 < /path/to/patch
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|diff --git a/sys/dev/drm2/i915/intel_fb.c b/sys/dev/drm2/i915/intel_fb.c
|index 3cb3b78..e41a49f 100644
|--- a/sys/dev/drm2/i915/intel_fb.c
|+++ b/sys/dev/drm2/i915/intel_fb.c
--
Patching file sys/dev/drm2/i915/intel_fb.c using Plan A...
Hunk #1 succeeded at 207 with fuzz 1.
Hunk #2 failed at 231.
1 out of 2 hunks failed--saving rejects to sys/dev/drm2/i915/intel_fb.c.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--
|diff --git a/sys/dev/syscons/scvgarndr.c b/sys/dev/syscons/scvgarndr.c
|index 6e6663c..fc7f02f 100644
|--- a/sys/dev/syscons/scvgarndr.c
|+++ b/sys/dev/syscons/scvgarndr.c
--
Patching file sys/dev/syscons/scvgarndr.c using Plan A...
Hunk #1 succeeded at 395.
Hunk #2 failed at 447.
1 out of 2 hunks failed--saving rejects to sys/dev/syscons/scvgarndr.c.rej
done


And the git way:

% git apply /path/to/patch
error: patch failed: sys/dev/drm2/i915/intel_fb.c:207
error: sys/dev/drm2/i915/intel_fb.c: patch does not apply
error: patch failed: sys/dev/syscons/scvgarndr.c:445
error: sys/dev/syscons/scvgarndr.c: patch does not apply


I have revision 251934 of the -STABLE branch. (I updated my source tree
about 3 hours ago using svn.)

Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-18 Thread Michiel Boland


On 06/18/2013 13:28, Javad Kouhi wrote:

> Hi,
>
> I have exactly the same problem. I've tried to apply the above patch
> but it refused. I've checked out the last revision (251934) of
> -STABLE branch using Subversion.
>
> % git apply --check patch
> error: patch failed: sys/dev/drm2/i915/intel_fb.c:207
> error: sys/dev/drm2/i915/intel_fb.c: patch does not apply
> error: patch failed: sys/dev/syscons/scvgarndr.c:445
> error: sys/dev/syscons/scvgarndr.c: patch does not apply
>
> How can I apply this patch?


I think you want to lose the '--check' option here.

Cheers
Michiel


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

Hi,

I have exactly the same problem. I've tried to apply the above patch
but it was refused. I've checked out the latest revision (251934) of
the -STABLE branch using Subversion.

% git apply --check patch
error: patch failed: sys/dev/drm2/i915/intel_fb.c:207
error: sys/dev/drm2/i915/intel_fb.c: patch does not apply
error: patch failed: sys/dev/syscons/scvgarndr.c:445
error: sys/dev/syscons/scvgarndr.c: patch does not apply

How can I apply this patch?

Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-17 Thread Konstantin Belousov

On Mon, Jun 17, 2013 at 09:16:56PM +0200, Michiel Boland wrote:
> On 06/16/2013 17:11, Michiel Boland wrote:
> > Hi. Recently I switched to WITH_NEW_XORG, primarily because the stock X
> > server with Intel driver has some issues that make it unusable for me.
> >
> > The new X server and Intel driver works extremely well, so kudos to
> > whoever made this possible.
> >
> > Unfortunately, I am now experiencing random hangs on shutdown. On shutdown
> > the system randomly freezes after
> >
> > [...] syslogd: exiting on signal 15
> >
> > I would then expect to see 'Waiting (max 60 seconds) for system process
> > 'XXX' to stop messages, but these never arrive.
>
> So it turns out that init hangs because vga_txtmouse (draw_txtmouse in fact)
> is hogging the clock swi. The routine is waiting for a vertical retrace which
> never arrives. (The new intel driver can't return to text console, so the
> screen just goes blank when X exits.)
>
> Some workarounds:
>
> - don't run moused (i.e. disable it in rc.conf and devd.conf)
>   instead run the X server in combination with hald
>
> - do run moused, but then either
>
>   - unplug the mouse before shutting down
>
>   - build a kernel with VGA_NO_FONT_LOADING
>
> Of course the long-term fix is to remove the possibly infinite loop in
> draw_txtmouse.
> 
> Thanks to Konstantin for his patience in helping me track this down.

The following patch, although a hack, should fix the issue.
Michiel tested it.

diff --git a/sys/dev/drm2/i915/intel_fb.c b/sys/dev/drm2/i915/intel_fb.c
index 3cb3b78..e41a49f 100644
--- a/sys/dev/drm2/i915/intel_fb.c
+++ b/sys/dev/drm2/i915/intel_fb.c
@@ -207,6 +207,8 @@ static void intel_fbdev_destroy(struct drm_device *dev,
 	}
 }
 
+extern int sc_txtmouse_no_retrace_wait;
+
 int intel_fbdev_init(struct drm_device *dev)
 {
 	struct intel_fbdev *ifbdev;
@@ -229,6 +231,7 @@ int intel_fbdev_init(struct drm_device *dev)
 
 	drm_fb_helper_single_add_all_connectors(&ifbdev->helper);
 	drm_fb_helper_initial_config(&ifbdev->helper, 32);
+	sc_txtmouse_no_retrace_wait = 1;
 	return 0;
 }
 
diff --git a/sys/dev/syscons/scvgarndr.c b/sys/dev/syscons/scvgarndr.c
index 6e6663c..fc7f02f 100644
--- a/sys/dev/syscons/scvgarndr.c
+++ b/sys/dev/syscons/scvgarndr.c
@@ -395,6 +395,8 @@ vga_txtblink(scr_stat *scp, int at, int flip)
 {
 }
 
+int sc_txtmouse_no_retrace_wait;
+
 #ifndef SC_NO_CUTPASTE
 
 static void
@@ -445,7 +447,9 @@ draw_txtmouse(scr_stat *scp, int x, int y)
 #if 1
 	/* wait for vertical retrace to avoid jitter on some videocards */
 	crtc_addr = scp->sc->adp->va_crtc_addr;
-	while (!(inb(crtc_addr + 6) & 0x08)) /* idle */ ;
+	while (!sc_txtmouse_no_retrace_wait &&
+	    !(inb(crtc_addr + 6) & 0x08))
+		/* idle */ ;
 #endif
 	c = scp->sc->mouse_char;
 	vidd_load_font(scp->sc->adp, 0, 32, 8, font_buf, c, 4);



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-17 Thread Michiel Boland


On 06/16/2013 17:11, Michiel Boland wrote:

> Hi. Recently I switched to WITH_NEW_XORG, primarily because the stock X server
> with Intel driver has some issues that make it unusable for me.
>
> The new X server and Intel driver works extremely well, so kudos to whoever made
> this possible.
>
> Unfortunately, I am now experiencing random hangs on shutdown. On shutdown the
> system randomly freezes after
>
> [...] syslogd: exiting on signal 15
>
> I would then expect to see 'Waiting (max 60 seconds) for system process 'XXX' to
> stop messages, but these never arrive.


So it turns out that init hangs because vga_txtmouse (draw_txtmouse in fact) is 
hogging the clock swi. The routine is waiting for a vertical retrace which never 
arrives. (The new intel driver can't return to text console, so the screen just 
goes blank when X exits.)


Some workarounds:

- don't run moused (i.e. disable it in rc.conf and devd.conf)
  instead run the X server in combination with hald

- do run moused, but then either

  - unplug the mouse before shutting down

  - build a kernel with VGA_NO_FONT_LOADING

Of course the long-term fix is to remove the possibly infinite loop in 
draw_txtmouse.


Thanks to Konstantin for his patience in helping me track this down.

Cheers
Michiel


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG


So apparently the value of 'rebooting' is 0 at the time of the hang...

db> x rebooting
rebooting:  0


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Konstantin Belousov

On Sun, Jun 16, 2013 at 08:06:21PM +0200, Michiel Boland wrote:
> On 06/16/2013 19:46, Konstantin Belousov wrote:
> > On Sun, Jun 16, 2013 at 07:12:33PM +0200, Michiel Boland wrote:
> >> On 06/16/2013 17:37, Konstantin Belousov wrote:
> >> [...]
> >>> I do not see anything related to i915 in the core.txt you provided.
> >>>
> >>> Next time the machine hangs, start with the output of ps command from
> >>> ddb and 'show allpcpu', together with 'alltrace'.
> >>>
> >>
> >> Ok, I captured 'ps', 'show allpcpu' and 'alltrace' from a stuck shutdown.
> >> I've appended it to my core.txt. (See previous e-mail.) (Note that the ddb
> >> commands are from a different session - so the ddb output may not match
> >> with the kgdb output.)
> >>
> >
> > Hm, how do you initiate the shutdown ? Show the exact command.
> > Also, from the same moment of the hung system, enter the ddb and
> > again do ps, alltrace and 'x rebooting'.
> >
> 
> The exact command to generate the hangs from which I created the reports was
> 
> 'shutdown -r now'
> 
> FWIW - the saved core from the ddb-induced panic has
> 
> (kgdb) print rebooting
> $1 = 1

I explicitly asked you to provide me with consistent ps/alltrace
and 'x rebooting' output.  What you did is useless.

In the ddb trace you appended, there is no thread which executes
the reboot(2) system call.



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Ian Lepore

On Sun, 2013-06-16 at 09:07 -0700, Jeremy Chadwick wrote:
> On Sun, Jun 16, 2013 at 06:01:49PM +0200, Michiel Boland wrote:
> > On 06/16/2013 17:55, Jeremy Chadwick wrote:
> > [...]
> > 
> > >Are you running moused(8)?  Actually, I can see quite clearly that you
> > >are in your core.txt:
> > >
> > >Starting ums0 moused.
> > >
> > >Try turning that off.  Don't ask me how, because devd(8) / devd.conf(5)
> > >might be involved.
> > >
> > 
> > The moused is started by devd - I don't see a quick way of turning that off.
> 
> Comment out the relevant crap in devd.conf(5).  Search for "ums"
> and comment out the two "notify" sections.

I don't understand why people treat devd as if it's some sort of evil
virus that they're forced to live with (using phrases like "crap in
devd.conf").  In general, the standard devd rules tend to fall into 3
categories:  
  * use logger(1) to record some anomaly
  * kldload a module
  * invoke a standard /etc/rc.d script

For moused, the devd rules invoke /etc/rc.d/moused, which implies that
setting moused_enable=NO in rc.conf would be all that's needed to
disable it.
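
A quick way to see what an rc.conf-style file currently says, as a
sketch only: rc_moused_setting is a hypothetical helper name, it looks
only at simple moused_enable=... lines in the one file it is given, and
it does not consult /etc/defaults/rc.conf or handle trailing comments.

```sh
#!/bin/sh
# Print the last moused_enable value found in an rc.conf-style file,
# or "unset" if the variable does not appear there.
# Assumptions: one assignment per line, starting at column 0, with no
# trailing comment. rc_moused_setting is a hypothetical helper name.
rc_moused_setting() {
    rc_file=${1:-/etc/rc.conf}
    # Keep the last assignment, take the text after "=", drop quotes.
    value=$(grep '^moused_enable=' "$rc_file" | tail -n 1 | cut -d= -f2 | tr -d "\"'")
    echo "${value:-unset}"
}
```

If this prints "unset" or "YES", adding moused_enable="NO" to rc.conf is
the change being suggested here; whether that alone is enough on a given
system still has to be confirmed by testing.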

-- Ian


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG


On 06/16/2013 20:06, Michiel Boland wrote:

> FWIW - the saved core from the ddb-induced panic has
>
> (kgdb) print rebooting
> $1 = 1



I realised instantly after I sent my message that this is meaningless - so 
please ignore that.



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG


On 06/16/2013 19:46, Konstantin Belousov wrote:

> On Sun, Jun 16, 2013 at 07:12:33PM +0200, Michiel Boland wrote:
>> On 06/16/2013 17:37, Konstantin Belousov wrote:
>> [...]
>>> I do not see anything related to i915 in the core.txt you provided.
>>>
>>> Next time the machine hangs, start with the output of ps command from
>>> ddb and 'show allpcpu', together with 'alltrace'.
>>>
>>
>> Ok, I captured 'ps', 'show allpcpu' and 'alltrace' from a stuck shutdown. I've
>> appended it to my core.txt. (See previous e-mail.) (Note that the ddb commands
>> are from a different session - so the ddb output may not match with the kgdb
>> output.)
>>
>
> Hm, how do you initiate the shutdown ? Show the exact command.
> Also, from the same moment of the hung system, enter the ddb and
> again do ps, alltrace and 'x rebooting'.



The exact command to generate the hangs from which I created the reports was

'shutdown -r now'

FWIW - the saved core from the ddb-induced panic has

(kgdb) print rebooting
$1 = 1


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Konstantin Belousov

On Sun, Jun 16, 2013 at 07:12:33PM +0200, Michiel Boland wrote:
> On 06/16/2013 17:37, Konstantin Belousov wrote:
> [...]
> > I do not see anything related to i915 in the core.txt you provided.
> >
> > Next time the machine hangs, start with the output of ps command from
> > ddb and 'show allpcpu', together with 'alltrace'.
> >
> 
> Ok, I captured 'ps', 'show allpcpu' and 'alltrace' from a stuck shutdown.
> I've appended it to my core.txt. (See previous e-mail.) (Note that the ddb
> commands are from a different session - so the ddb output may not match with
> the kgdb output.)
> 

Hm, how do you initiate the shutdown ? Show the exact command.
Also, from the same moment of the hung system, enter the ddb and
again do ps, alltrace and 'x rebooting'.



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG


On 06/16/2013 17:37, Konstantin Belousov wrote:
[...]

> I do not see anything related to i915 in the core.txt you provided.
>
> Next time the machine hangs, start with the output of ps command from
> ddb and 'show allpcpu', together with 'alltrace'.



Ok, I captured 'ps', 'show allpcpu' and 'alltrace' from a stuck shutdown. I've 
appended it to my core.txt. (See previous e-mail.) (Note that the ddb commands 
are from a different session - so the ddb output may not match with the kgdb 
output.)




Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Steven Hartland

- Original Message - 
From: "Michiel Boland" 

To: "FreeBSD Stable" 
Sent: Sunday, June 16, 2013 4:11 PM
Subject: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

Hi. Recently I switched to WITH_NEW_XORG, primarily because the stock X server 
with Intel driver has some issues that make it unusable for me.

The new X server and Intel driver work extremely well, so kudos to whoever made
this possible.

Unfortunately, I am now experiencing random hangs on shutdown. On shutdown the
system randomly freezes after

[...] syslogd: exiting on signal 15

I would then expect to see "Waiting (max 60 seconds) for system process 'XXX' to
stop" messages, but these never arrive.

I panicked the machine in ddb, so I have a crash dump if someone wants to look at
it. The crashinfo is at http://barrytown.boland.org/core.txt (I would have
pasted it here but it is a bit verbose.)

Machine has an Intel G41 chipset, with a SAMSUNG SSD 830 Series HD, running
9.1-STABLE r251803. Serial console. GENERIC kernel, except for options DDB and
ALT_BREAK_TO_DEBUGGER.

Who knows what's going on here?

Does setting the sysctl: hw.usb.no_shutdown_wait=1 help?
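
A minimal sketch of how that could be tried, assuming the knob is exposed as
a run-time sysctl on this 9.1-STABLE kernel (the first command only reads the
current value, which confirms that it exists):

    # sysctl hw.usb.no_shutdown_wait
    # sysctl hw.usb.no_shutdown_wait=1

To keep the setting across reboots, the same assignment would go into
/etc/sysctl.conf:

    hw.usb.no_shutdown_wait=1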

   Regards
   steve



Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Jeremy Chadwick

On Sun, Jun 16, 2013 at 06:01:49PM +0200, Michiel Boland wrote:
> On 06/16/2013 17:55, Jeremy Chadwick wrote:
> [...]
> 
> >Are you running moused(8)?  Actually, I can see quite clearly that you
> >are in your core.txt:
> >
> >Starting ums0 moused.
> >
> >Try turning that off.  Don't ask me how, because devd(8) / devd.conf(5)
> >might be involved.
> >
> 
> The moused is started by devd - I don't see a quick way of turning that off.

Comment out the relevant crap in devd.conf(5).  Search for "ums"
and comment out the two "notify" sections.
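
As a rough sketch, the two sections look approximately like this once
commented out. The exact text is an assumption based on the stock
/etc/devd.conf of that era and may differ on your system, so search for
"ums" in your own file rather than copying this verbatim.

    #notify 100 {
    #	match "system"		"DEVFS";
    #	match "subsystem"	"CDEV";
    #	match "type"		"CREATE";
    #	match "cdev"		"ums[0-9]+";
    #
    #	action "/etc/rc.d/moused quietstart $cdev";
    #};
    #
    #notify 100 {
    #	match "system"		"DEVFS";
    #	match "subsystem"	"CDEV";
    #	match "type"		"DESTROY";
    #	match "cdev"		"ums[0-9]+";
    #
    #	action "/etc/rc.d/moused stop $cdev";
    #};

devd(8) has to re-read its configuration afterwards, for example with
"service devd restart".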

> As a workaround I'm trying to run a kernel with
> 
>  options SC_NO_SYSMOUSE
> 
> to see if the hangs go away.

That's one way to do it, I guess.

Be aware that I do not use X; however, I have repeatedly seen mentioned
on these lists problems/complexities where people rely on moused(8)
to "drive their mouse" while inside of X (or possibly where X and
moused(8) are both simultaneously polling the mouse).  There's
apparently a very specific kind of X configuration you're supposed to
use to get proper mouse/keyboard/HAL/HID/whatever support, and tons of
people have it wrong.  Warren Block, I think, has some insights into
this, or could maybe help shed some light on what I'm remembering.

-- 
| Jeremy Chadwick                                  j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG


On 06/16/2013 17:55, Jeremy Chadwick wrote:
[...]


Are you running moused(8)?  Actually, I can see quite clearly that you
are in your core.txt:

Starting ums0 moused.

Try turning that off.  Don't ask me how, because devd(8) / devd.conf(5)
might be involved.



The moused is started by devd - I don't see a quick way of turning that off.

As a workaround I'm trying to run a kernel with

 options SC_NO_SYSMOUSE

to see if the hangs go away.

Cheers
Michiel


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Jeremy Chadwick

On Sun, Jun 16, 2013 at 05:48:52PM +0200, Michiel Boland wrote:
> On 06/16/2013 17:37, Konstantin Belousov wrote:
> >On Sun, Jun 16, 2013 at 05:11:15PM +0200, Michiel Boland wrote:
> >>Hi. Recently I switched to WITH_NEW_XORG, primarily because the stock X
> >>server with Intel driver has some issues that make it unusable for me.
> >>
> >>The new X server and Intel driver work extremely well, so kudos to whoever
> >>made this possible.
> >>
> >>Unfortunately, I am now experiencing random hangs on shutdown. On shutdown
> >>the system randomly freezes after
> >>
> >>[...] syslogd: exiting on signal 15
> >>
> >>I would then expect to see "Waiting (max 60 seconds) for system process
> >>'XXX' to stop" messages, but these never arrive.
> >>
> >>I panicked the machine in ddb, so I have a crash dump if someone wants to
> >>look at it. The crashinfo is at http://barrytown.boland.org/core.txt (I
> >>would have pasted it here but it is a bit verbose.)
> >>
> >>Machine has an Intel G41 chipset, with a SAMSUNG SSD 830 Series HD, running
> >>9.1-STABLE r251803. Serial console. GENERIC kernel, except for options DDB
> >>and ALT_BREAK_TO_DEBUGGER.
> >>
> >>Who knows what's going on here?
> >
> >I do not see anything related to i915 in the core.txt you provided.
> >
> >Next time the machine hangs, start with the output of ps command from
> >ddb and 'show allpcpu', together with 'alltrace'.
> >
> 
> Ok.
> 
> I appended 'thread apply all bt' from kgdb to the core.txt, maybe
> there is something interesting in there.
> 
> I did notice the following
> 
> Thread 17 (Thread 17):
> #0  cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1392
> #1  0x80cbebbd in ipi_nmi_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1374
> #2  0x80ccc159 in trap (frame=0x81424890) at /usr/src/sys/amd64/amd64/trap.c:211
> #3  0x80cb55af in nmi_calltrap () at /usr/src/sys/amd64/amd64/exception.S:501
> #4  0x80d0c029 in vga_txtmouse (scp=0xfe0005586600, x=320, y=200, on=<value optimized out>) at cpufunc.h:186
> Previous frame inner to this frame (corrupt stack?)
> 
> Maybe the hang is caused by the removal of the text mouse cursor?
> (Just guessing here.)

vga_txtmouse comes from syscons(4).

Are you making use of vidcontrol(1) in any way to set the system console
(outside of X) to something that uses the VGA framebuffer?  There are
probably some loader.conf or rc.conf variables that control this (I do
not know).

Are you running moused(8)?  Actually, I can see quite clearly that you
are in your core.txt:

Starting ums0 moused.

Try turning that off.  Don't ask me how, because devd(8) / devd.conf(5)
might be involved.
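
One possibly simpler route, offered as a sketch rather than something
confirmed in this thread: the rc.d/moused script that devd(8) invokes
consults rc.conf(5) before starting a daemon for a hot-plugged device.
Assuming the stock script and defaults, either of these lines in
/etc/rc.conf should keep it from starting for ums0:

    moused_nondefault_enable="NO"
    moused_ums0_enable="NO"

The first applies to every non-default mouse device, the second only to
ums0.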

-- 
| Jeremy Chadwick                                  j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG


On 06/16/2013 17:37, Konstantin Belousov wrote:

On Sun, Jun 16, 2013 at 05:11:15PM +0200, Michiel Boland wrote:

Hi. Recently I switched to WITH_NEW_XORG, primarily because the stock X server
with Intel driver has some issues that make it unusable for me.

The new X server and Intel driver work extremely well, so kudos to whoever made
this possible.

Unfortunately, I am now experiencing random hangs on shutdown. On shutdown the
system randomly freezes after

[...] syslogd: exiting on signal 15

I would then expect to see "Waiting (max 60 seconds) for system process 'XXX' to
stop" messages, but these never arrive.

I panicked the machine in ddb, so I have a crash dump if someone wants to look at
it. The crashinfo is at http://barrytown.boland.org/core.txt (I would have
pasted it here but it is a bit verbose.)

Machine has an Intel G41 chipset, with a SAMSUNG SSD 830 Series HD, running
9.1-STABLE r251803. Serial console. GENERIC kernel, except for options DDB and
ALT_BREAK_TO_DEBUGGER.

Who knows what's going on here?


I do not see anything related to i915 in the core.txt you provided.

Next time the machine hangs, start with the output of ps command from
ddb and 'show allpcpu', together with 'alltrace'.



Ok.

I appended 'thread apply all bt' from kgdb to the core.txt, maybe there is 
something interesting in there.
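
For readers who want to reproduce that step, a sketch of the kgdb session.
It assumes the dump was saved by savecore(8) as /var/crash/vmcore.0 and that
the matching kernel is /boot/kernel/kernel; both paths are the usual
defaults, not details taken from this report.

    # kgdb /boot/kernel/kernel /var/crash/vmcore.0
    (kgdb) thread apply all bt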


I did notice the following

Thread 17 (Thread 17):
#0  cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1392
#1  0x80cbebbd in ipi_nmi_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1374
#2  0x80ccc159 in trap (frame=0x81424890) at /usr/src/sys/amd64/amd64/trap.c:211
#3  0x80cb55af in nmi_calltrap () at /usr/src/sys/amd64/amd64/exception.S:501
#4  0x80d0c029 in vga_txtmouse (scp=0xfe0005586600, x=320, y=200, on=<value optimized out>) at cpufunc.h:186

Previous frame inner to this frame (corrupt stack?)

Maybe the hang is caused by the removal of the text mouse cursor? (Just guessing 
here.)


Cheers
Michiel


Re: system sporadically hangs on shutdown after switching to WITH_NEW_XORG

2013-06-16 Thread Konstantin Belousov

On Sun, Jun 16, 2013 at 05:11:15PM +0200, Michiel Boland wrote:
> Hi. Recently I switched to WITH_NEW_XORG, primarily because the stock X
> server with Intel driver has some issues that make it unusable for me.
>
> The new X server and Intel driver work extremely well, so kudos to whoever
> made this possible.
>
> Unfortunately, I am now experiencing random hangs on shutdown. On shutdown
> the system randomly freezes after
>
> [...] syslogd: exiting on signal 15
>
> I would then expect to see "Waiting (max 60 seconds) for system process
> 'XXX' to stop" messages, but these never arrive.
>
> I panicked the machine in ddb, so I have a crash dump if someone wants to
> look at it. The crashinfo is at http://barrytown.boland.org/core.txt (I
> would have pasted it here but it is a bit verbose.)
>
> Machine has an Intel G41 chipset, with a SAMSUNG SSD 830 Series HD, running
> 9.1-STABLE r251803. Serial console. GENERIC kernel, except for options DDB
> and ALT_BREAK_TO_DEBUGGER.
> 
> Who knows what's going on here?

I do not see anything related to i915 in the core.txt you provided.

Next time the machine hangs, start with the output of ps command from
ddb and 'show allpcpu', together with 'alltrace'.



system sporadically hangs on shutdown after switching to WITH_NEW_XORG