On Sat, 18 Aug 2007, Dag Wieers wrote:

> On Fri, 17 Aug 2007, Mag Gam wrote:
> > On 8/17/07, John R Pierce <[EMAIL PROTECTED]> wrote:
> > > Mag Gam wrote:
> > >
> > > > I have a server with 2 HBAs, and the users keep complaining about
> > > > performance problems. My question is, how can I relate the process
> > > > with high I/O wait? Also, is it possible to see how much data is being
> > > > pushed thru by my 2 HBAs?
> > >
> > > iostat (part of the sysstat package) will answer your 2nd question.
> > >
> > > I dunno how to measure io wait time per process.  maybe IBM's NMON can
> > > do that, not sure, I haven't used it for a while.
> > > http://www-941.haw.ibm.com/collaboration/wiki/display/WikiPtype/nmon
> >
> > Thanks John.
> > 
> > Yes, this is a tricky question, but I face this a lot... Unfortunately,
> > I am not sure how to check the adapter throughput, or which process is
> > causing the I/O wait.
> 
> I believe that recent kernels have a patch applied that shows I/O 
> counters per process. I haven't looked into it yet though.
> 
> This is one of the most important items on my wishlist for dstat, a topio 
> plugin next to the existing topcpu and topmem plugins.

I found the following interesting information while googling. Now I need 
to find a kernel that provides the counters ;-)

Based on this information I will most likely have topio, topio_real and 
topio_ops
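As a rough illustration of what such a topio plugin might do, here is a
minimal sketch (in Python, since dstat is written in it). It assumes a
kernel that actually exposes /proc/&lt;pid&gt;/io; the helper names are my own
invention, not dstat code:

```python
import os
import time

def io_counters(pid):
    # Return (read_bytes, write_bytes) from /proc/<pid>/io, or None when
    # the kernel lacks the counters or the process vanished/is unreadable.
    try:
        with open("/proc/%s/io" % pid) as f:
            fields = dict(line.split(":") for line in f)
        return int(fields["read_bytes"]), int(fields["write_bytes"])
    except (OSError, KeyError, ValueError):
        return None

def top_io(interval=1.0, count=5):
    # Hypothetical "topio": sample every pid twice and rank processes by
    # bytes moved to/from the storage layer during the interval.
    pids = [p for p in os.listdir("/proc") if p.isdigit()]
    first = {pid: io_counters(pid) for pid in pids}
    time.sleep(interval)
    ranked = []
    for pid, before in first.items():
        after = io_counters(pid)
        if before and after:
            delta = (after[0] - before[0]) + (after[1] - before[1])
            ranked.append((delta, pid))
    ranked.sort(reverse=True)
    return ranked[:count]
```

Sampling read_bytes/write_bytes rather than rchar/wchar means the ranking
reflects real storage traffic instead of pagecache hits.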

        2.14  /proc/<pid>/io - Display the IO accounting fields
        -------------------------------------------------------

        This file contains IO statistics for each running process
        
        Example
        -------
        
        test:/tmp # dd if=/dev/zero of=/tmp/test.dat &
        [1] 3828
        
        test:/tmp # cat /proc/3828/io
        rchar: 323934931
        wchar: 323929600
        syscr: 632687
        syscw: 632675
        read_bytes: 0
        write_bytes: 323932160
        cancelled_write_bytes: 0
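The format is trivial to consume programmatically; a quick sketch (Python
again) that parses the sample output above into integer counters:

```python
def parse_proc_io(text):
    # Turn the "key: value" lines of /proc/<pid>/io into a dict of ints.
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters

# Sample taken verbatim from the dd example above.
sample = """\
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
"""
```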
        
        
        Description
        -----------
        
        rchar
        -----
        
        I/O counter: chars read
        The number of bytes which this task has caused to be read from
        storage. This is simply the sum of bytes which this process passed
        to read() and pread(). It includes things like tty IO and it is
        unaffected by whether or not actual physical disk IO was required
        (the read might have been satisfied from pagecache).
        
        
        wchar
        -----
        
        I/O counter: chars written
        The number of bytes which this task has caused, or shall cause to be
        written to disk. Similar caveats apply here as with rchar.
        
        
        syscr
        -----
        
        I/O counter: read syscalls
        Attempt to count the number of read I/O operations, i.e. syscalls
        like read() and pread().
        
        
        syscw
        -----
        
        I/O counter: write syscalls
        Attempt to count the number of write I/O operations, i.e. syscalls
        like write() and pwrite().
        
        
        read_bytes
        ----------
        
        I/O counter: bytes read
        Attempt to count the number of bytes which this process really did
        cause to be fetched from the storage layer. Done at the submit_bio()
        level, so it is accurate for block-backed filesystems. <please add
        status regarding NFS and CIFS at a later time>
        
        
        write_bytes
        -----------
        
        I/O counter: bytes written
        Attempt to count the number of bytes which this process caused to be 
sent to
        the storage layer. This is done at page-dirtying time.
        
        
        cancelled_write_bytes
        ---------------------
        
        The big inaccuracy here is truncate. If a process writes 1MB to a
        file and then deletes the file, it will in fact perform no writeout.
        But it will have been accounted as having caused 1MB of write.
        In other words: the number of bytes which this process caused to not
        happen, by truncating pagecache. A task can cause "negative" IO too.
        If this task truncates some dirty pagecache, some IO which another
        task has been accounted for (in its write_bytes) will not be
        happening. We _could_ just subtract that from the truncating task's
        write_bytes, but there is information loss in doing that.
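The truncate behaviour is easy to reproduce in-process (a sketch, assuming
a kernel that exposes /proc/self/io; whether cancelled_write_bytes actually
moves depends on the filesystem and on writeback timing, so only the wchar
delta is guaranteed):

```python
import os
import tempfile

def read_io():
    # Parse this process's own counters from /proc/self/io into ints.
    with open("/proc/self/io") as f:
        return {k.strip(): int(v)
                for k, _, v in (line.partition(":") for line in f)}

before = read_io()
fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * (1 << 20))  # dirty 1MB of pagecache
os.close(fd)
os.unlink(path)                  # delete before writeback: writeout cancelled
after = read_io()
# wchar always grows by the 1MB passed to write(); cancelled_write_bytes
# may or may not grow, depending on whether writeback already ran.
```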
        
        
        Note
        ----
        
        At its current implementation state, this is a bit racy on 32-bit
        machines: if process A reads process B's /proc/pid/io while process
        B is updating one of those 64-bit counters, process A could see an
        intermediate result.
        
        
        More information about this can be found within the taskstats
        documentation in Documentation/accounting.

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
