On Thursday, 13 August 2020 at 07:08:21 UTC, Jon Degenhardt wrote:
Test                          Elapsed  System   User
----                          -------  ------   ----
tsv-select -f 2,3 FILE          10.28    0.42   9.85
cat FILE | tsv-select -f 2,3    11.10    1.45  10.23
cut -f 2,3 FILE                 14.64    0.60  14.03
cat FILE | cut -f 2,3           14.36    1.03  14.19
wc -l FILE                       1.32    0.39   0.93
cat FILE | wc -l                 1.18    0.96   1.04


The TREE file:

Test                          Elapsed  System   User
----                          -------  ------   ----
tsv-select -f 2,3 FILE           3.77    0.95   2.81
cat FILE | tsv-select -f 2,3     4.54    2.65   3.28
cut -f 2,3 FILE                 17.78    1.53  16.24
cat FILE | cut -f 2,3           16.77    2.64  16.36
wc -l FILE                       1.38    0.91   0.46
cat FILE | wc -l                 2.02    2.63   0.77



Your table shows that piping the output of one process into another adds a lot of time spent in kernel mode: System time for tsv-select goes from 0.42 to 1.45 seconds in the first table and from 0.95 to 2.65 seconds for the TREE file. A switch from user mode to kernel mode is expensive [1]. A call to getpid() costs around 1000-1500 clock cycles on most systems; around 100 of those cycles are the actual switch and the rest is overhead.
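To put numbers on that, here is a small sketch that computes how much System time each command gains when it is fed through cat. The figures are copied from the two quoted tables; the names FIRST_TABLE, TREE_TABLE and system_time_increase are mine, chosen only for illustration.

```python
# (elapsed, system, user) in seconds, copied from the quoted tables.
# Each entry holds the direct run first and the `cat FILE | ...` run second.
FIRST_TABLE = {
    "tsv-select -f 2,3": ((10.28, 0.42, 9.85), (11.10, 1.45, 10.23)),
    "cut -f 2,3":        ((14.64, 0.60, 14.03), (14.36, 1.03, 14.19)),
    "wc -l":             ((1.32, 0.39, 0.93), (1.18, 0.96, 1.04)),
}
TREE_TABLE = {
    "tsv-select -f 2,3": ((3.77, 0.95, 2.81), (4.54, 2.65, 3.28)),
    "cut -f 2,3":        ((17.78, 1.53, 16.24), (16.77, 2.64, 16.36)),
    "wc -l":             ((1.38, 0.91, 0.46), (2.02, 2.63, 0.77)),
}


def system_time_increase(direct, piped):
    """Return (extra seconds, ratio) of System time for piped vs. direct."""
    extra = round(piped[1] - direct[1], 2)
    ratio = round(piped[1] / direct[1], 1)
    return extra, ratio


for label, table in (("first table", FIRST_TABLE), ("TREE file", TREE_TABLE)):
    for command, (direct, piped) in table.items():
        extra, ratio = system_time_increase(direct, piped)
        print(f"{label:12} {command:18} +{extra:.2f}s system ({ratio}x)")
```

In every row the piped run spends more time in System than the direct run, even in the rows where Elapsed goes down.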

My theory is this:
One of the reasons for the slowdown is very likely mutex locking and unlocking, which is needed more often when multiple processes share (global) resources than when a single process runs on its own.
Another is copying buffers.
When you read a file, the data is first read into a kernel buffer and then copied to the user-space buffer, i.e. the buffer you allocated in your program (the read from disk might not happen if the data is still in the cache). If you read the file directly in your program, the data is copied once from kernel space to user space. When you read from stdin (which is technically a file), it would seem that cat reads the file, which means a copy from kernel to user space (cat's buffer); then cat writes that buffer to stdout (also technically a file), which is another copy, this time into the kernel's pipe buffer; then your program reads from stdin, which copies the data from the pipe buffer into your allocated buffer.
Each of those steps may involve a mutex lock/unlock.
Also with pipes you start two programs. Starting a program takes a few ms.
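
The copy chain described above can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: a writer thread stands in for cat (so process startup is not modelled), the 64 KiB chunk size is arbitrary, and the function names are made up. It counts system calls rather than measuring time.

```python
import os
import threading

CHUNK = 64 * 1024  # assumed buffer size, chosen only for illustration


def read_direct(path):
    """Read the file directly. Returns (bytes read, number of system calls)."""
    total = calls = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buf = os.read(fd, CHUNK)  # copy: kernel buffer -> our buffer
            calls += 1
            if not buf:
                break
            total += len(buf)
    finally:
        os.close(fd)
    return total, calls


def read_via_pipe(path):
    """Emulate `cat FILE | prog`. Returns (bytes read, number of system calls)."""
    rfd, wfd = os.pipe()
    writer_calls = [0]

    def cat():
        fd = os.open(path, os.O_RDONLY)
        try:
            while True:
                buf = os.read(fd, CHUNK)  # copy 1: kernel buffer -> cat's buffer
                writer_calls[0] += 1
                if not buf:
                    break
                view = memoryview(buf)
                while view:
                    n = os.write(wfd, view)  # copy 2: cat's buffer -> pipe buffer
                    writer_calls[0] += 1
                    view = view[n:]
        finally:
            os.close(fd)
            os.close(wfd)  # closing the write end signals EOF to the reader

    writer = threading.Thread(target=cat)
    writer.start()
    total = reader_calls = 0
    while True:
        buf = os.read(rfd, CHUNK)  # copy 3: pipe buffer -> our buffer
        reader_calls += 1
        if not buf:
            break
        total += len(buf)
    os.close(rfd)
    writer.join()
    return total, writer_calls[0] + reader_calls
```

Calling both functions on the same file should return the same byte count, with the piped variant needing well over twice as many system calls, each of which is a user/kernel switch.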

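PS. If you do your own caching, or if you don't care about caching because you just read a file sequentially once, you may benefit from opening your file with the O_DIRECT flag, which basically means that the kernel bypasses its own cache and transfers the data directly into your user-space buffers. On Linux this typically requires the buffer address, the file offset and the read size to be aligned to the block size of the device.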
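
A minimal sketch of such a sequential read, assuming Linux and Python 3: O_DIRECT is Linux-specific, the function name and the 1 MiB buffer size are my own choices, and the code falls back to a normal read if the filesystem refuses O_DIRECT. An anonymous mmap is used as the buffer because it is page-aligned, which usually satisfies the alignment requirement. Whether this is actually faster depends on the workload, so measure before relying on it.

```python
import mmap
import os


def _read_all(fd, buf):
    """Read until a short read, which for a regular file means end of file."""
    total = 0
    while True:
        n = os.readv(fd, [buf])  # fills the page-aligned buffer in place
        total += n
        if n < len(buf):
            return total


def count_bytes_direct(path, bufsize=1 << 20):
    """Read `path` sequentially, with O_DIRECT where possible.

    Returns (bytes read, whether O_DIRECT was actually used).
    bufsize should be a multiple of the device block size.
    """
    direct = getattr(os, "O_DIRECT", 0)  # 0 on platforms without the flag
    attempts = [os.O_RDONLY | direct] if direct else []
    attempts.append(os.O_RDONLY)  # fallback: ordinary buffered read
    buf = mmap.mmap(-1, bufsize)  # anonymous mapping, page-aligned
    try:
        for flags in attempts:
            try:
                fd = os.open(path, flags)
            except OSError:
                continue  # e.g. the filesystem rejects O_DIRECT
            try:
                return _read_all(fd, buf), flags != os.O_RDONLY
            except OSError:
                continue  # e.g. EINVAL from an alignment mismatch
            finally:
                os.close(fd)
        raise OSError("could not read " + path)
    finally:
        buf.close()
```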

[1] https://en.wikipedia.org/wiki/Ring_(computer_security)
