On Fri, Dec 9, 2011 at 1:09 PM, Win Htin <win.h...@gmail.com> wrote:

> >> Hi folks,
> >>
> >> I have a particular process which is very latency sensitive (in the
> >> milliseconds range). The network team has determined the
> >> response/round-trip time between their monitoring device and that
> >> particular process can go up to 4x above the norm. Looking through all
> >> sorts of standard Linux commands and output from our monitoring tool
> >> (Nagios), I can see that CPU usage never exceeds 20% and app/user
> >> memory usage stays below 50% during the so-called slowness
> >> windows.
> >>
> >> This issue pops up only a handful of times each day and since it is
> >> very transient in nature, it is extremely difficult to determine what
> >> is causing this intermittent slowness. Any idea what sort of tools
> >> might be able to help me pinpoint the issue?
> >>
> >> The users are not complaining about other processes running on the
> >> same server because those other processes are comparatively not that
> >> latency sensitive.
> >>
> >> I am running RHEL 5.7 (64bit) with kernel revision  2.6.18-274.3.1.el5.
>
> > When the process is responding slowly, is it paged out?  Have you tuned
> > vm.swappiness for your server load to reduce the chances of processes
> > getting paged out in favor of disk caching?  The default vm.swappiness
> > may not be ideal for your workload.
>
> With the standard tools I have, it is not possible to actively monitor
> this in real time since the latency issue is only part of the multiple
> stages of the application and is in the milliseconds range. That
> said, I don't think it is paged out.
>
> > What about disk I/O?  I suggest you run "iostat -xnd 1" to monitor your
> > disk use during this time.
>
> The disk I/O is very little. The app is mostly in memory. Again, even
> if it was disk I/O related, the iostat command most likely will not be
> able to capture it due to the issue lasting less than 30 milliseconds.
>
> The users are expecting e.g. max X millisecond response and if it is
> anything above, it is no good for them. Anyhow, I did a tcpdump and
> the app guys turned on debug mode while the tcpdump was going on. I
> guess we can correlate some useful info out of that and go from there.
>

Maybe you would need realtime Linux or similar, with network QoS configured?
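
For the tcpdump/debug-log correlation mentioned in the quoted message, a
rough sketch along these lines might help narrow things down. It assumes
the round-trip times have already been extracted from the capture as
(timestamp, rtt_ms) pairs and the app debug log as (timestamp, message)
pairs. The function names, the use of the median as "the norm", and the
50 ms matching window are all illustrative assumptions, not anything
taken from the thread or from a particular tool.

```python
from statistics import median


def find_slow_responses(samples, factor=4.0):
    """Return the (timestamp, rtt_ms) pairs whose RTT exceeds
    factor times the median RTT of all samples.

    Assumption: the median stands in for "the norm"; the 4x default
    mirrors the figure quoted by the network team.
    """
    if not samples:
        return []
    baseline = median(rtt for _, rtt in samples)
    return [(ts, rtt) for ts, rtt in samples if rtt > factor * baseline]


def nearby_log_events(slow, log_events, window_s=0.05):
    """Map each slow sample's timestamp to the debug-log messages
    logged within window_s seconds of it (hypothetical window size)."""
    return {
        ts: [msg for log_ts, msg in log_events if abs(log_ts - ts) <= window_s]
        for ts, _ in slow
    }


if __name__ == "__main__":
    # Made-up numbers purely to show the call pattern.
    rtts = [(0.0, 5.0), (1.0, 6.0), (2.0, 25.0), (3.0, 5.5)]
    logs = [(1.98, "stage B started"), (2.5, "stage C started")]
    slow = find_slow_responses(rtts)
    print(slow)
    print(nearby_log_events(slow, logs))
```

Both hosts need reasonably synchronized clocks for the timestamps to line
up at millisecond granularity; otherwise the matching window has to be
widened accordingly.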
_______________________________________________
rhelv5-list mailing list
rhelv5-list@redhat.com
https://www.redhat.com/mailman/listinfo/rhelv5-list
