Re: [CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

2015-02-17 Thread Chris Murphy
On Tue, Feb 17, 2015 at 10:02 PM, Jason Pyeron wrote:
> I can say, we have about 20 of the identical systems, doing the same work.
> PE2970 running RHEL6/Centos6 and libvirtd

20 other identical systems doing the same work strongly suggests hardware problem when there's a single outlier.

> >> I

Re: [CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

2015-02-17 Thread Jason Pyeron
> -Original Message-
> From: Chris Murphy
> Sent: Tuesday, February 17, 2015 23:38
>
> On Tue, Feb 17, 2015 at 7:34 PM, Jason Pyeron wrote:
> >> -Original Message-
> >> From: Chris Murphy
> >> Sent: Tuesday, February 17, 2015 20:48
> >>
> >> On Tue, Feb 17, 2015 at 7:54 AM, Jason P

Re: [CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

2015-02-17 Thread Chris Murphy
On Tue, Feb 17, 2015 at 7:34 PM, Jason Pyeron wrote:
>> -Original Message-
>> From: Chris Murphy
>> Sent: Tuesday, February 17, 2015 20:48
>>
>> On Tue, Feb 17, 2015 at 7:54 AM, Jason Pyeron wrote:
>> >> I'd post the entire dmesg somewhere
>> >
>> > http://client.pdinc.us/panic-341e97c30b5

Re: [CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

2015-02-17 Thread Jason Pyeron
> -Original Message-
> From: Chris Murphy
> Sent: Tuesday, February 17, 2015 20:48
>
> On Tue, Feb 17, 2015 at 7:54 AM, Jason Pyeron wrote:
> >> I'd post the entire dmesg somewhere
> >
> > http://client.pdinc.us/panic-341e97c30b5a4cb774942bae32d3f163.log
>
> At least part of the problem h

Re: [CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

2015-02-17 Thread Chris Murphy
On Tue, Feb 17, 2015 at 7:54 AM, Jason Pyeron wrote:
>> I'd post the entire dmesg somewhere
>
> http://client.pdinc.us/panic-341e97c30b5a4cb774942bae32d3f163.log

At least part of the problem happens before this log starts.

>> What do you get for
>> smartctl -x
>
> http://client.pdinc.us/smartct

Re: [CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

2015-02-17 Thread Jason Pyeron
> -Original Message-
> From: Chris Murphy
> Sent: Tuesday, February 17, 2015 3:58
>
> I think the panic is the consequence of drive write failure.
> So the actual
> problem is before the panic call trace.

Most of the time it panics without any warning, but once there was:

> > -Orig

Re: [CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

2015-02-17 Thread Chris Murphy
I think the panic is the consequence of drive write failure. So the actual problem is before the panic call trace.

I'd post the entire dmesg somewhere wrap safe (either your mail agent or the forum is hard wrapping and is a pain to read).

What do you get for
smartctl -x

In the meantime check or r

Re: [CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

2015-02-16 Thread Jason Pyeron
> -Original Message-
> From: Jason Pyeron
> Sent: Sunday, February 08, 2015 0:00
>
> > -Original Message-
> > From: Jason Pyeron
> > Sent: Saturday, February 07, 2015 22:54
> >
> > NOTE: this is happening on Centos 6 x86_64,
> > 2.6.32-504.3.3.el6.x86_64 not Centos 5
> >
> > Del

Re: [CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

2015-02-07 Thread Jason Pyeron
> -Original Message-
> From: Jason Pyeron
> Sent: Saturday, February 07, 2015 22:54
>
> NOTE: this is happening on Centos 6 x86_64,
> 2.6.32-504.3.3.el6.x86_64 not Centos 5
>
> Dell PowerEdge 2970, Seagate SATA drive, non-raid.
>
> I have this server which has been dying randomly, with

[CentOS] Intermittent problem, likely disk IO related - mptscsih: ioc0: attempting task abort!

2015-02-07 Thread Jason Pyeron
NOTE: this is happening on Centos 6 x86_64, 2.6.32-504.3.3.el6.x86_64 not Centos 5

Dell PowerEdge 2970, Seagate SATA drive, non-raid.

I have this server which has been dying randomly, with no logs. I had a tail -f over ssh for a week, when this just happened.

Feb 8 00:10:21 thirteen-230 kerne
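
Since the thread keeps coming back to finding what the kernel logged before the panic, a minimal sketch of pulling the driver messages out of a saved log may help. It assumes the log is plain text with one message per line, and that the string "mptscsih:" from the subject line is what marks the relevant entries. The function name and the marker default are illustrative, not something posted in the thread.

```python
from typing import Iterable, Iterator

# Marker taken from the thread subject ("mptscsih: ioc0: attempting task abort!").
# Assumption: the relevant kernel messages all contain this substring.
MARKER = "mptscsih:"


def filter_driver_lines(lines: Iterable[str], marker: str = MARKER) -> Iterator[str]:
    """Yield only the log lines that mention the driver marker.

    Trailing newlines are stripped so the result pastes cleanly into a mail
    or a pastebin without extra blank lines.
    """
    for line in lines:
        if marker in line:
            yield line.rstrip("\n")


# Example use (the path is an assumption; use wherever the log was saved):
#   with open("/var/log/messages") as fh:
#       for hit in filter_driver_lines(fh):
#           print(hit)
```

Because it works on any iterable of lines, the same function can be fed an open file or the captured output of a tail session.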