About effective resolution of cpu execution clocks

2012-10-24 Thread Miguel Telleria de Esteban
Dear all,

I am resending a question that I posted yesterday to the Linux Real-Time
mailing list.  It received no answers, maybe because it is a newbie
question.

http://marc.info/?l=linux-rt-users&m=135099100016609&w=2

Let me rephrase it for a general audience that is not necessarily
real-time aware:

The question concerns the per-process CPU usage statistics maintained by
the kernel.  As far as I can tell, the only places where this usage
counter is stored are in the utime and stime fields of task_struct.

http://lxr.linux.no/#linux+v3.6.3/include/linux/sched.h#L1362 
(line 1362)

I have observed that these fields are of type cputime_t, which seems
to be defined as an unsigned long and therefore contains 32 bits (at
least on a 32-bit architecture such as x86).

http://lxr.linux.no/linux+v3.6.3/include/asm-generic/cputime.h#L7

These fields utime and stime are used as accumulators of time usage in
the implementation of POSIX CPU-usage clocks and timers.

http://lxr.linux.no/#linux+v3.6.3/kernel/posix-cpu-timers.c

A typical use case of this functionality is measuring the CPU time
consumed by a thread.  In real-time systems this information can be
used to trigger further actions, such as changing the thread's
priority or sending it a signal.

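Here is an example using NPTL from libc:

#include <pthread.h>
#include <time.h>

clockid_t clock;
struct timespec before_ts, after_ts, interval;

/* Get the CPU-time clock of the calling thread. */
pthread_getcpuclockid(pthread_self(), &clock);
clock_gettime(clock, &before_ts);

... do your things here

clock_gettime(clock, &after_ts);

/* timespec_substract() is not a libc function; it has to be
   supplied by the user. */
interval = timespec_substract(after_ts, before_ts);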

In this code the time is stored in a struct timespec, which on a
32-bit architecture is composed of two 32-bit fields (tv_sec and
tv_nsec), giving both a resolution of nanoseconds and a span of years.
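
For reference, a self-contained version of the fragment above might
look like the sketch below.  The names timespec_sub(),
measure_thread_cpu() and busy_loop() are made up for illustration
(timespec_sub() stands in for the user-supplied subtraction helper),
and error handling is kept to a minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: returns end - start, assuming end >= start and
 * that both values are normalized (0 <= tv_nsec < 1e9). */
static struct timespec timespec_sub(struct timespec end, struct timespec start)
{
    struct timespec diff;

    assert(end.tv_nsec >= 0 && end.tv_nsec < 1000000000L);
    assert(start.tv_nsec >= 0 && start.tv_nsec < 1000000000L);

    diff.tv_sec = end.tv_sec - start.tv_sec;
    diff.tv_nsec = end.tv_nsec - start.tv_nsec;
    if (diff.tv_nsec < 0) {          /* borrow one second */
        diff.tv_sec -= 1;
        diff.tv_nsec += 1000000000L;
    }
    return diff;
}

/* Runs work() on the calling thread and stores the CPU time it
 * consumed in *used.  Returns 0 on success, -1 if the clock could not
 * be obtained or read. */
static int measure_thread_cpu(void (*work)(void), struct timespec *used)
{
    clockid_t clock;
    struct timespec before_ts, after_ts;

    if (pthread_getcpuclockid(pthread_self(), &clock) != 0)
        return -1;
    if (clock_gettime(clock, &before_ts) != 0)
        return -1;
    work();
    if (clock_gettime(clock, &after_ts) != 0)
        return -1;
    *used = timespec_sub(after_ts, before_ts);
    return 0;
}

/* Example workload (an assumption): burn some CPU cycles. */
static void busy_loop(void)
{
    volatile unsigned long sink = 0;

    for (unsigned long i = 0; i < 10000000UL; i++)
        sink += i;
}
```

Calling measure_thread_cpu(busy_loop, &used) fills used with the CPU
time consumed by busy_loop().  On older glibc versions, linking may
need -pthread and -lrt.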

On the other hand, a single 32-bit integer such as utime or stime
cannot provide both a high resolution and a long time span.  And
according to the proc man page, when these fields are output in
/proc/pid/stat they are expressed in jiffies (1/CONFIG_HZ s, i.e.
4 ms in most kernel configs).
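
The trade-off can be made concrete with a little arithmetic.  The
sketch below (span_seconds_u32() is a name invented here) computes how
long a 32-bit unsigned counter can run before wrapping for a given
tick length.  The 4 ms tick in the usage note assumes HZ=250.

```c
#include <stdint.h>

/* Longest interval, in seconds, that a 32-bit unsigned counter can
 * represent when each count stands for tick_ns nanoseconds. */
static double span_seconds_u32(uint64_t tick_ns)
{
    return (double)UINT32_MAX * (double)tick_ns / 1e9;
}
```

With a 4 ms tick, span_seconds_u32(4000000) is about 1.7e7 s, i.e.
roughly 198.8 days.  With a 1 ns tick, span_seconds_u32(1) is only
about 4.29 s, so a 32-bit counter at nanosecond resolution would wrap
after a few seconds.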

The way clock_gettime() works when linked to a process CPU clock is by
keeping a counter of CPU usage, updated by the scheduler on every
preemption, and by using hardware facilities to measure the latest
time period.

I assume that Linux, especially since the merge of high-resolution
timers in 2.6.21, now benefits from the latest hardware facilities for
time management, gaining resolutions of microseconds and nanoseconds,
as reported by clock_getres().
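
A minimal sketch of querying that reported resolution for the calling
thread's CPU-time clock.  thread_cpu_clock_res_ns() is a hypothetical
wrapper; clock_getres() and CLOCK_THREAD_CPUTIME_ID are standard
POSIX.  The value reported is not necessarily the resolution that is
actually achieved.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Returns the resolution, in nanoseconds, that the system reports for
 * the calling thread's CPU-time clock, or -1 on error. */
static long long thread_cpu_clock_res_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_THREAD_CPUTIME_ID, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}
```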

With this background in mind I repeat the same questions that I asked
in the linux-rt mailing list:

*  What is the effective resolution of the interval between two
   invocations of clock_gettime() on the same running thread over a
   long period involving several CPU preemptions?

*  Are there other fields apart from stime and utime with
   sufficient precision to maintain a CPU usage count?

*  Does the PREEMPT_RT branch improve this resolution somehow?
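
One rough way to probe the first question from user space is to read
the thread CPU clock back to back and record the smallest non-zero
step.  This is only a sketch under assumptions: the function names are
invented, the result includes the cost of the clock_gettime() call
itself, and it says nothing about what happens across preemptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* CPU time of the calling thread in nanoseconds, or -1 on error. */
static int64_t thread_cpu_now_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Smallest non-zero step seen between consecutive readings over
 * `samples` readings; -1 on error or if the clock never advanced. */
static int64_t smallest_observed_step_ns(int samples)
{
    int64_t best = -1;
    int64_t prev = thread_cpu_now_ns();

    if (prev < 0)
        return -1;
    for (int i = 0; i < samples; i++) {
        int64_t now = thread_cpu_now_ns();

        if (now < 0)
            return -1;
        if (now > prev && (best < 0 || now - prev < best))
            best = now - prev;
        prev = now;
    }
    return best;
}
```

A result close to the jiffy length would suggest tick-based accounting,
while a result in the sub-microsecond range would suggest a finer
counter behind the clock.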

Thanks in advance for your time.
Cheers,

   Miguel Telleria

-- 

---
  Miguel TELLERIA DE ESTEBAN       Grupo de computadores y tiempo real
  telleriam ENSAIMADA unican.es    Dept. Electrónica y Computadores
   (change ENSAIMADA for @)        Universidad de Cantabria

  http://www.ctr.unican.es         Tel trabajo: +34 942 201477
---



signature.asc
Description: PGP signature
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: buffer page concepts in the page cache

2011-01-25 Thread Miguel Telleria de Esteban
Thanks Mulyadi,

On Wed, 26 Jan 2011 01:19:52 +0700 Mulyadi Santosa wrote:

> Hi Miguel...
>
> Tough questions, let's see if I can make it :D
>
> On Tue, Jan 25, 2011 at 19:56, Miguel Telleria de Esteban
> mig...@mtelleria.com wrote:
> > MY INTERPRETATION (please correct me if I am wrong)
> >
> > Q1  What is a buffer page?
> >
> > A buffer page is a struct page describing a page allocated
> > to hold one or more I/O blocks from disk.
>
> I agree... in other words, they are pages that hold data while the
> I/O is still in flight. But since they are part of the page cache,
> they aren't thrown away after the I/O is done... for a few moments
> they are held in RAM, in case they're subsequently read... thus, the
> I/O frequency toward physical disks is reduced.
>
> I think we now call it the page cache.
>
> > Q2  Is the whole page cache content organized as buffer pages?
> >
> > YES, there is no other way to link memory-mapped disk I/O data to
> > the struct page pointed to by address_space radix-tree entries.
>
> Not so sure, but it's something like that IMHO.
>
> > ---
> >
> > Q3  block device buffer_pages vs file buffer_pages
> >
> > This I really don't understand.  From what UTLK page 614 says:
> >
> > *  File buffer_pages ONLY refer to non-contiguous (on disk layout)
> >    file contents.
> >
> > *  blockdev buffer_pages refer to single-block or contiguous (on
> >    disk layout) portions of block.
> >
> > My question is:  what happens with non-fragmented medium-size files
> > that do not contain disk holes or non-adjacent block submissions?
>
> Here's my understanding:
> 1. when you're dealing with a file in raw form, e.g. using dd on
> /dev/sda1 or dd with direct I/O, you use the block buffer cache
>
> 2. when you deal with files using the read()/write() facility of the
> filesystem (thus via VFS), you use the file page cache...

This makes sense.  Looking through LXR at the do_generic_file_read()
function (actually do_generic_mapping_read()), the address_space used
is that of the file, not that of the device.

Maybe dd also goes through this same path, since you directly specify
the device file to read from.

The other read path (the bread() function) seems to be used when
looking up metadata (inodes, superblocks), which is not requested by
the user-space read() call.


 
> To experiment with it, simply start top and examine which field
> increases when you do dd, cat, etc.

Hmm, this is not clear to me.  I would like to check which
address_space object I am using (the block device's or the file's), so
I guess I need deeper tools (maybe ftrace?) to see it.
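
A minimal sketch of the experiment Mulyadi suggests, without top.  It
assumes that the Buffers and Cached lines of /proc/meminfo are the
counters of interest (Buffers for block device pages, Cached for file
page cache).  meminfo_kb() is a helper name invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Returns the value in kB of one /proc/meminfo field such as "Buffers"
 * or "Cached", or -1 if the file or the field is not available. */
static long meminfo_kb(const char *field)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    long value = -1;
    size_t len = strlen(field);

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Lines look like "Buffers:          123456 kB". */
        if (strncmp(line, field, len) == 0 && line[len] == ':') {
            if (sscanf(line + len + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(f);
    return value;
}
```

Reading meminfo_kb("Buffers") and meminfo_kb("Cached") before and
after a dd on the device file, and again around a cat of a regular
file, should show which counter grows in each case.  It does not
identify the address_space object itself, so a tracing tool would
still be needed for that.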

 
> I hope I helped you instead of confusing you :D
 

Thanks, you have helped.  On my side I continue (re)reading :).



-- 

  (O-O)
---oOO-(_)-OOo-
 Miguel TELLERIA DE ESTEBAN   http://www.mtelleria.com
 Email: miguel at mtelleria.com   Tel GSM:  +34 650 801098
  Tel Fix:  +34 942 280174

 Miembro de http://www.linuca.org    Membre du http://www.bxlug.be
 ¿Usuario captivo o libre?    http://www.obtengalinux.org/windows/
 Free or captive user?        http://www.getgnulinux.org/windows/
---


