Re: [CentOS] Weird performance problem

2009-05-24 Thread Dag Wieers

On Thu, 16 Apr 2009, Ugo Bellavance wrote:


Ugo Bellavance wrote:


I'm running a CentOS 4.  server and I sometimes face a weird problem.
It is a weird performance problem, and here is how I discovered it.

This server runs OpenVZ virtual machines, and one of them is an asterisk
server for my personal use.  The first symptom of the problem is that
the voice quality became flaky.  So I logged on the server to see what
could be eating cpu cycles, when I ran top, it took almost one minute
before top actually showed.  Another hint is that when I run dstat (a
monitoring utility that is a mix of iostat and vmstat and other stats),
I often get a "missed xx ticks", where xx is a number.


Another hint is that pings are really slow.  Even pinging localhost is
very slow.  The first reply is fast, but the second takes ages to come.


I am glad that my dstat tool provided you the information about missing 
ticks, because all the other tools were basically giving you wrong 
statistics.


Dstat's statistics are wrong too, but at least it is providing you with a 
hint that you shouldn't trust the numbers. :)
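
For readers unfamiliar with the message, here is a minimal Python sketch of how a sampling loop could notice that it has fallen behind its schedule. It is not dstat's actual code; the function names and the one-second interval are assumptions made for illustration only.

```python
import time


def missed_ticks(start, now, interval, ticks_seen):
    """Return how many whole intervals elapsed beyond those already handled.

    start, now  -- wall-clock timestamps in seconds
    interval    -- intended sampling period in seconds
    ticks_seen  -- number of samples the loop has accounted for so far
    """
    expected = int((now - start) // interval)
    return max(0, expected - ticks_seen)


def monitor(interval=1.0, samples=5):
    """Sample once per interval and report when wall-clock time ran ahead."""
    start = time.time()
    ticks_seen = 0
    for _ in range(samples):
        time.sleep(interval)
        ticks_seen += 1
        missed = missed_ticks(start, time.time(), interval, ticks_seen)
        if missed:
            print(f"missed {missed} ticks")
            # Catch up so the same gap is not reported twice.
            ticks_seen += missed
```

Calling monitor() on a healthy host should normally print nothing. If a one-second sleep regularly returns many seconds late, the loop prints a message similar to the ones in the dstat output.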


How to fix this for OpenVZ I can't tell, but I am sure the OpenVZ forums 
have smart people with insight into it.


If you find that information I am very interested to know what the 
solution is for OpenVZ. (Since you're running their kernel, I would have 
thought this would not have been an issue though)


Thanks for keeping us posted !
--
--   dag wieers,  d...@centos.org,  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Weird performance problem

2009-04-22 Thread Mike Fedyk
Can you find the first time this problem occurred?  How about trying
older kernel versions?

You are either dealing with misbehaving hardware/driver or you need to
tweak the settings on your clock source.

Believe it or not, this kind of problem affects VMware too, and its
documentation is a good source of info on fixing it, so read those docs.
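
As a starting point for checking the clock source, the following Python sketch reads the kernel's clocksource entries from sysfs. The sysfs path is an assumption: it exists on many 2.6.18-and-later kernels, but older or vendor kernels may not expose it, in which case the sketch simply reports nothing. The function names are hypothetical.

```python
import os

# Assumed sysfs location; not every kernel exposes it.
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_value(path):
    """Return the stripped contents of a small text file, or None if unreadable."""
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return None


def clocksource_report(base=CLOCKSOURCE_DIR):
    """Collect the current and available clock sources, if the kernel lists them."""
    current = read_value(os.path.join(base, "current_clocksource"))
    available = read_value(os.path.join(base, "available_clocksource"))
    return {
        "current": current,
        "available": available.split() if available else [],
    }


print(clocksource_report())
```

If the report is empty, the boot messages in dmesg are the next place to look for which timer the kernel selected.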

On 4/20/09, Ugo Bellavance  wrote:
> JohnS wrote:
>> On Sun, 2009-04-19 at 09:02 -0700, Akemi Yagi wrote:
>>> On Sun, Apr 19, 2009 at 8:34 AM, JohnS  wrote:
>>>
>>>> I don't recall anyone really saying if there was indeed a fix put into
>>>> that PAE or any Specific Kernel for CentOS. Can the CentOS Kernel
>>>> Builder Comment Please?
>>>>
>>>> Also See: +1
>>>> http://wiki.centos.org/TipsAndTricks/VMWare_Server?highlight=(100hz)
>>>>
>>>> JohnStanley
>>> As noted at the top of that wiki page, the contents need to be
>>> updated.  When I added that note, I intended to do it asap but have
>>> not had a chance to do so.  However, the link referenced in there (
>>> http://kb.vmware.com/kb/1006427 ) is the best source for timekeeping
>>> at this moment.  In short, CentOS no longer offers 100Hz kernels
>>> because the divider=10 kernel option now works.
>>>
>>> Akemi
>> 
>> Thanks Akemi for the update on it. That should fix him up,
>
> I'm not using VMWare, I'm using OpenVZ...
>
> Ugo
>



Re: [CentOS] Weird performance problem

2009-04-21 Thread JohnS

On Mon, 2009-04-20 at 08:12 -0400, Ugo Bellavance wrote:
> JohnS wrote:
> > On Sun, 2009-04-19 at 09:02 -0700, Akemi Yagi wrote:
> >> On Sun, Apr 19, 2009 at 8:34 AM, JohnS  wrote:
> >>
> >>> I don't recall anyone really saying if there was indeed a fix put into
> >>> that PAE or any Specific Kernel for CentOS. Can the CentOS Kernel
> >>> Builder Comment Please?
> >>>
> >>> Also See: +1
> >>> http://wiki.centos.org/TipsAndTricks/VMWare_Server?highlight=(100hz)
> >>>
> >>> JohnStanley
> >> As noted at the top of that wiki page, the contents need to be
> >> updated.  When I added that note, I intended to do it asap but have
> >> not had a chance to do so.  However, the link referenced in there (
> >> http://kb.vmware.com/kb/1006427 ) is the best source for timekeeping
> >> at this moment.  In short, CentOS no longer offers 100Hz kernels
> >> because the divider=10 kernel option now works.
> >>
> >> Akemi
> > 
> > Thanks Akemi for the update on it. That should fix him up,
> 
> I'm not using VMWare, I'm using OpenVZ...
> 
> Ugo
-
Perhaps pursue your question with OpenVZ. Try the kernel option?



Re: [CentOS] Weird performance problem

2009-04-20 Thread Ugo Bellavance
JohnS wrote:
> On Sun, 2009-04-19 at 09:02 -0700, Akemi Yagi wrote:
>> On Sun, Apr 19, 2009 at 8:34 AM, JohnS  wrote:
>>
>>> I don't recall anyone really saying if there was indeed a fix put into
>>> that PAE or any Specific Kernel for CentOS. Can the CentOS Kernel
>>> Builder Comment Please?
>>>
>>> Also See: +1
>>> http://wiki.centos.org/TipsAndTricks/VMWare_Server?highlight=(100hz)
>>>
>>> JohnStanley
>> As noted at the top of that wiki page, the contents need to be
>> updated.  When I added that note, I intended to do it asap but have
>> not had a chance to do so.  However, the link referenced in there (
>> http://kb.vmware.com/kb/1006427 ) is the best source for timekeeping
>> at this moment.  In short, CentOS no longer offers 100Hz kernels
>> because the divider=10 kernel option now works.
>>
>> Akemi
> 
> Thanks Akemi for the update on it. That should fix him up,

I'm not using VMWare, I'm using OpenVZ...

Ugo



Re: [CentOS] Weird performance problem

2009-04-19 Thread JohnS

On Sun, 2009-04-19 at 09:02 -0700, Akemi Yagi wrote:
> On Sun, Apr 19, 2009 at 8:34 AM, JohnS  wrote:
> 
> > I don't recall anyone really saying if there was indeed a fix put into
> > that PAE or any Specific Kernel for CentOS. Can the CentOS Kernel
> > Builder Comment Please?
> >
> > Also See: +1
> > http://wiki.centos.org/TipsAndTricks/VMWare_Server?highlight=(100hz)
> >
> > JohnStanley
> 
> As noted at the top of that wiki page, the contents need to be
> updated.  When I added that note, I intended to do it asap but have
> not had a chance to do so.  However, the link referenced in there (
> http://kb.vmware.com/kb/1006427 ) is the best source for timekeeping
> at this moment.  In short, CentOS no longer offers 100Hz kernels
> because the divider=10 kernel option now works.
> 
> Akemi

Thanks Akemi for the update on it. That should fix him up,

JohnStanley



Re: [CentOS] Weird performance problem

2009-04-19 Thread Akemi Yagi
On Sun, Apr 19, 2009 at 8:34 AM, JohnS  wrote:

> I don't recall anyone really saying if there was indeed a fix put into
> that PAE or any Specific Kernel for CentOS. Can the CentOS Kernel
> Builder Comment Please?
>
> Also See: +1
> http://wiki.centos.org/TipsAndTricks/VMWare_Server?highlight=(100hz)
>
> JohnStanley

As noted at the top of that wiki page, the contents need to be
updated.  When I added that note, I intended to do it asap but have
not had a chance to do so.  However, the link referenced in there (
http://kb.vmware.com/kb/1006427 ) is the best source for timekeeping
at this moment.  In short, CentOS no longer offers 100Hz kernels
because the divider=10 kernel option now works.
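
To check whether a running system was booted with that option, a short Python sketch along these lines could parse the kernel command line. The helper names and the sample command line are hypothetical, and the 1000 Hz base tick rate is an assumption about the stock kernel, not something stated in this thread.

```python
def kernel_params(cmdline):
    """Split a kernel command line into a dict; bare flags map to True."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def effective_tick_hz(cmdline, base_hz=1000):
    """Return the tick rate implied by an optional divider=N parameter.

    base_hz is an assumed default; adjust it for the kernel in question.
    """
    divider = kernel_params(cmdline).get("divider")
    if divider is None or divider is True:
        return base_hz
    return base_hz // int(divider)


# Hypothetical command line, for illustration only.
sample = "ro root=LABEL=/ divider=10"
print(effective_tick_hz(sample))
```

On a live system the input would come from /proc/cmdline rather than a literal string.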

Akemi


Re: [CentOS] Weird performance problem

2009-04-19 Thread JohnS

On Sat, 2009-04-18 at 15:54 -0400, Ugo Bellavance wrote:
> JohnS wrote:
> 
> > -
> > That's a known problem with the Kernel and VM Kernel. You need the fixed
> > kernel.
> 
> Do you have a URL of the bug or something?
> 
> I updated to the latest kernel.
> 
> I was running 2.6.18-92.1.18.el5.028stab060.2PAE.
> 
> Regards,
---
One or two people have recompiled a kernel to be compatible with CentOS,
and it is available on their site; perhaps they can join in and give you
the address.

Please read:
http://www.vmware.com/pdf/vmware_timekeeping.pdf

I don't recall anyone really saying if there was indeed a fix put into
that PAE or any Specific Kernel for CentOS. Can the CentOS Kernel
Builder Comment Please?

Also See: +1
http://wiki.centos.org/TipsAndTricks/VMWare_Server?highlight=(100hz)

JohnStanley



Re: [CentOS] Weird performance problem

2009-04-18 Thread Ugo Bellavance
JohnS wrote:

> -
> That's a known problem with the Kernel and VM Kernel. You need the fixed
> kernel.

Do you have a URL of the bug or something?

I updated to the latest kernel.

I was running 2.6.18-92.1.18.el5.028stab060.2PAE.

Regards,



Re: [CentOS] Weird performance problem

2009-04-17 Thread JohnS

On Thu, 2009-04-16 at 09:12 -0400, Ugo Bellavance wrote:
> Hi,
> 
> I'm running a CentOS 4.  server and I sometimes face a weird problem. 
> It is a weird performance problem, and here is how I discovered it.
> 
> This server runs OpenVZ virtual machines, and one of them is an asterisk 
> server for my personal use.  The first symptom of the problem is that 
> the voice quality became flaky.  So I logged on the server to see what 
> could be eating cpu cycles, when I ran top, it took almost one minute 
> before top actually showed.  Another hint is that when I run dstat (a 
> monitoring utility that is a mix of iostat and vmstat and other stats), 
> I often get a "missed xx ticks", where xx is a number.
> 
> Example (current) (sorry for the wrap):
> 
> total-cpu-usage -dsk/total- -net/total- ---paging-- ---system--
> usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
>3   2  93   2   0   0| 106k  273k|   0 0 | 0.2   0.4 |1039   389
>3   6  91   0   0   0|   0  6416k| 276k  275k|   0 0 |2160  6822 
>   missed 55 ticks
>4  10  84   2   0   0|1200k 1992k|  82k   93k|   0 0 |1188  6275 
>   missed 29 ticks
>1   0  99   0   0   0|   0  1312k|  65k   66k|   0 0 |1050  1114 
>   missed 38 ticks
>2   1  96   0   0   0|   0  1168k|  57k   59k|   0 0 | 491   877 
>   missed 13 ticks
>3   1  94   1   0   0|   0  6016k| 181k  176k|   0 0 |2169  5996 
>   missed 50 ticks
>4   2  91   1   0   0|  28k 8744k| 216k  214k|   0 0 |2159  5438 
>   missed 37 ticks
>1   1  98   0   0   0|   0  2632k|  93k   91k|   0 0 | 983  1381 
>   missed 34 ticks
>1   1  98   1   0   0|   0  5624k| 113k  110k|   0 0 |1569  2643 
>   missed 52 ticks
>1   1  98   1   0   0|   0  2432k|  29k   28k|   0 0 | 679   647 
>   missed 12 ticks
>0   0 100   0   0   0|   0 0 |  60B  374B|   0 0 |  1315
>2   3  94   0   0   0|   0  1872k| 209k  210k|   0 0 |1375  3590 
>   missed 30 ticks
> 
> 
> 
> The problem is currently occurring, but it doesn't seem to be affecting 
> voice quality for now, so I have some time to try to find the cause. 
> The only solution I've found up to now is to reboot... But hey, this 
> isn't a Windows 98 machine :)!
> 
> I tried restarting the VZ system, which restarts all the VMs, but it 
> didn't solve the problem.  I can't tell if the problem occurs on a stock 
> centos kernel, because the server is running production (but 
> non-critical) virtual machines, so it is always running the openVZ kernel.
> 
> So here is what I've done for now:
> 
> - Top shows a load of about 0.4
> 
> - vmstat 1 10 shows this:
> 
> procs ---memory-- ---swap-- -io --system-- 
> cpu
>   r  b   swpd   free   buff  cache   si   sobibo   incs us 
> sy id wa
>   0  0592 191092 381720 5379560053684 3  3 
> 2 93  2
>   0  0592 190720 381720 53795600 0 0   3260  1 
> 1 98  0
>   0  0592 191092 381720 53795600 0 0   4159  0 
> 0 100  0
>   1  0592 191092 381728 53794800 0  2584  31196 10 
> 4 66 19
>   0  0592 189968 381732 53794400 0  2080  222   174  2 
> 3 79 16
>   0  1592 189968 381732 53794400 0  3244  17073 10 
> 4 73 12
>   0  0592 190216 381732 53794400 0   136   76   113  1 
> 2 93  4
>   0  0592 189844 381732 53794400 0 0   3369  1 
> 1 98  0
>   0  0592 189844 381732 53794400 0 0   2432  0 
> 0 100  0
>   0  0592 190340 381732 53794400 0 0   2842  0 
> 0 100  0
> 
> iostat -x 1 (excerpt)
> 
> Device:rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/srkB/swkB/s 
> avgrq-sz avgqu-sz   await  svctm  %util
> sda  0.00 171.00  0.00 124.000.00 2368.00 0.00  1184.00 
> 19.10 0.141.13   0.02   0.20
> sdb  0.00   0.00  0.00  0.000.000.00 0.00 0.00 
> 0.00 0.000.00   0.00   0.00
> sdc  0.00 171.00  0.00 124.000.00 2368.00 0.00  1184.00 
> 19.10 0.171.35   0.02   0.30
> sdd  0.00   0.00  0.00  0.000.000.00 0.00 0.00 
> 0.00 0.000.00   0.00   0.00
> md0  0.00   0.00  0.00  0.000.000.00 0.00 0.00 
> 0.00 0.000.00   0.00   0.00
> md2  0.00   0.00  0.00  0.000.000.00 0.00 0.00 
> 0.00 0.000.00   0.00   0.00
> md1  0.00   0.00  0.00 294.000.00 2352.00 0.00  1176.00 
>  8.00 0.000.00   0.00   0.00
> dm-0 0.00   0.00  0.00  0.000.000.00 0.00 0.00 
> 0.00 0.000.00   0.00   0.00
> dm-1 0.00   0.00  0.00  0.000.000.00 0.00 0.00 
> 0.00 0.000.00   0.00   0.00
> dm-2 0.00   0.00  0.00 294.000.00 2352.00 0.00  1176.00 
>  8.00 0.301.01   0.02   0.50
> d

Re: [CentOS] Weird performance problem

2009-04-16 Thread Ugo Bellavance
Ugo Bellavance wrote:
> Hi,
> 
> I'm running a CentOS 4.  server and I sometimes face a weird problem. 
> It is a weird performance problem, and here is how I discovered it.
> 
> This server runs OpenVZ virtual machines, and one of them is an asterisk 
> server for my personal use.  The first symptom of the problem is that 
> the voice quality became flaky.  So I logged on the server to see what 
> could be eating cpu cycles, when I ran top, it took almost one minute 
> before top actually showed.  Another hint is that when I run dstat (a 
> monitoring utility that is a mix of iostat and vmstat and other stats), 
> I often get a "missed xx ticks", where xx is a number.

Another hint is that pings are really slow.  Even pinging localhost is 
very slow.  The first reply is fast, but the second takes ages to come.

It seems to be blocking here:

recvmsg(3, 0xbfbf84b0, MSG_DONTWAIT)= -1 EAGAIN (Resource 
temporarily unavailable)
gettimeofday({1239887784, 389347}, NULL) = 0
poll(

The rest comes as soon as there is another response:

[{fd=3, events=POLLIN|POLLERR}], 1, 999) = 0
gettimeofday({1239887903, 119727}, NULL) = 0
gettimeofday({1239887903, 119791}, NULL) = 0
sendmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(0), 
sin_addr=inet_addr("127.0.0.1")}, 
msg_iov(1)=[{"\10\0\335\2018)\0\4\0370\347I\357\323\1\0\10\t\n\v\f\r\16\17\20\21\22\23\24\25\26\27"...,
 
64}], msg_controllen=0, msg_flags=0}, MSG_CONFIRM) = 64
recvmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(0), 
sin_addr=inet_addr("127.0.0.1")}, 
msg_iov(1)=[{"e\0\0t\26\264\...@\1e\363\177\0\0\1\177\0\0\1\0\0\345\2018)\0\4\0370\347I"...,
 
192}], msg_controllen=20, {cmsg_len=20, cmsg_level=SOL_SOCKET, 
cmsg_type=0x1d /* SCM_??? */, ...}, msg_flags=0}, 0) = 84
write(1, "64 bytes from hn01.domain"..., 82) = 82
recvmsg(3, 0xbfbf84b0, MSG_DONTWAIT)= -1 EAGAIN (Resource 
temporarily unavailable)
gettimeofday({1239887903, 120785}, NULL) = 0
poll(

Then it blocks again...

This confuses Nagios, which is running in a VM on this server.

Can the 'gettimeofday' be the problem?  'date' runs w/o delay.
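
The two gettimeofday values that bracket the poll() call in the trace can be compared directly. The small Python sketch below does only that arithmetic; the variable names are made up, and the numbers are copied from the strace output above.

```python
def timeval_us(seconds, microseconds):
    """Convert a (sec, usec) pair as printed by strace into microseconds."""
    return seconds * 1_000_000 + microseconds


# Timestamps taken from the trace: just before and just after poll().
before_poll = timeval_us(1239887784, 389347)
after_poll = timeval_us(1239887903, 119727)

# Timeout argument passed to poll(), in milliseconds.
POLL_TIMEOUT_MS = 999

elapsed_us = after_poll - before_poll
elapsed_s = elapsed_us / 1_000_000

print(f"poll() timeout requested: {POLL_TIMEOUT_MS / 1000:.3f} s")
print(f"wall-clock time elapsed:  {elapsed_s:.3f} s")
```

By the wall clock, a poll() that asked for a timeout of just under one second returned roughly 118.7 seconds later. That is consistent with timer ticks being lost, though on its own it does not prove where the delay comes from.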

Thanks,

Ugo



[CentOS] Weird performance problem

2009-04-16 Thread Ugo Bellavance
Hi,

I'm running a CentOS 4.  server and I sometimes face a weird problem. 
It is a weird performance problem, and here is how I discovered it.

This server runs OpenVZ virtual machines, and one of them is an asterisk 
server for my personal use.  The first symptom of the problem is that 
the voice quality became flaky.  So I logged on to the server to see what 
could be eating CPU cycles.  When I ran top, it took almost one minute 
before top actually showed anything.  Another hint is that when I run dstat (a 
monitoring utility that is a mix of iostat and vmstat and other stats), 
I often get a "missed xx ticks", where xx is a number.

Example (current) (sorry for the wrap):

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  3   2  93   2   0   0| 106k  273k|   0     0 | 0.2   0.4 |1039   389
  3   6  91   0   0   0|   0  6416k| 276k  275k|   0     0 |2160  6822  missed 55 ticks
  4  10  84   2   0   0|1200k 1992k|  82k   93k|   0     0 |1188  6275  missed 29 ticks
  1   0  99   0   0   0|   0  1312k|  65k   66k|   0     0 |1050  1114  missed 38 ticks
  2   1  96   0   0   0|   0  1168k|  57k   59k|   0     0 | 491   877  missed 13 ticks
  3   1  94   1   0   0|   0  6016k| 181k  176k|   0     0 |2169  5996  missed 50 ticks
  4   2  91   1   0   0|  28k 8744k| 216k  214k|   0     0 |2159  5438  missed 37 ticks
  1   1  98   0   0   0|   0  2632k|  93k   91k|   0     0 | 983  1381  missed 34 ticks
  1   1  98   1   0   0|   0  5624k| 113k  110k|   0     0 |1569  2643  missed 52 ticks
  1   1  98   1   0   0|   0  2432k|  29k   28k|   0     0 | 679   647  missed 12 ticks
  0   0 100   0   0   0|   0     0 |  60B  374B|   0     0 |  1315
  2   3  94   0   0   0|   0  1872k| 209k  210k|   0     0 |1375  3590  missed 30 ticks



The problem is currently occurring, but it doesn't seem to be affecting 
voice quality for now, so I have some time to try to find the cause. 
The only solution I've found up to now is to reboot... But hey, this 
isn't a Windows 98 machine :)!

I tried restarting the VZ system, which restarts all the VMs, but it 
didn't solve the problem.  I can't tell if the problem occurs on a stock 
centos kernel, because the server is running production (but 
non-critical) virtual machines, so it is always running the openVZ kernel.

So here is what I've done for now:

- Top shows a load of about 0.4

- vmstat 1 10 shows this:

procs ---memory-- ---swap-- -io --system-- 
cpu
  r  b   swpd   free   buff  cache   si   sobibo   incs us 
sy id wa
  0  0592 191092 381720 5379560053684 3  3 
2 93  2
  0  0592 190720 381720 53795600 0 0   3260  1 
1 98  0
  0  0592 191092 381720 53795600 0 0   4159  0 
0 100  0
  1  0592 191092 381728 53794800 0  2584  31196 10 
4 66 19
  0  0592 189968 381732 53794400 0  2080  222   174  2 
3 79 16
  0  1592 189968 381732 53794400 0  3244  17073 10 
4 73 12
  0  0592 190216 381732 53794400 0   136   76   113  1 
2 93  4
  0  0592 189844 381732 53794400 0 0   3369  1 
1 98  0
  0  0592 189844 381732 53794400 0 0   2432  0 
0 100  0
  0  0592 190340 381732 53794400 0 0   2842  0 
0 100  0

iostat -x 1 (excerpt)

Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s   wsec/s   rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda         0.00  171.00    0.00  124.00    0.00  2368.00    0.00  1184.00    19.10     0.14    1.13   0.02   0.20
sdb         0.00    0.00    0.00    0.00    0.00     0.00    0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc         0.00  171.00    0.00  124.00    0.00  2368.00    0.00  1184.00    19.10     0.17    1.35   0.02   0.30
sdd         0.00    0.00    0.00    0.00    0.00     0.00    0.00     0.00     0.00     0.00    0.00   0.00   0.00
md0         0.00    0.00    0.00    0.00    0.00     0.00    0.00     0.00     0.00     0.00    0.00   0.00   0.00
md2         0.00    0.00    0.00    0.00    0.00     0.00    0.00     0.00     0.00     0.00    0.00   0.00   0.00
md1         0.00    0.00    0.00  294.00    0.00  2352.00    0.00  1176.00     8.00     0.00    0.00   0.00   0.00
dm-0        0.00    0.00    0.00    0.00    0.00     0.00    0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1        0.00    0.00    0.00    0.00    0.00     0.00    0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2        0.00    0.00    0.00  294.00    0.00  2352.00    0.00  1176.00     8.00     0.30    1.01   0.02   0.50
dm-3        0.00    0.00    0.00  294.00    0.00  2352.00    0.00  1176.00     8.00     0.30    1.01   0.02   0.50
dm-4        0.00    0.00    0.00    0.00    0.00     0.00    0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5 0.00   0.00  0.00  0.000.000