On 13/10/2024 05:56, Masatake YAMATO wrote:
When copying files, the system data cache is consumed for both the
source and destination files. In scenarios such as creating backups of
old, unused files, it is clear to users that these files will not be
needed in the near future. In such cases, retaining the data for these
files in the cache wastes computer resources, especially when
memory-hungry applications are running in the foreground.

With the new option, users can request that the system data cache be
discarded, thereby avoiding unwanted swapping out of data belonging to
foreground processes.

I evaluated cache consumption using a script called run.bash.
Initially, run.bash creates many small files, each 8 KB in size. It then
copies these files using the cp command, both with and without the
specified option. Finally, it reports the difference in the total size
of the caches before and after the copying process.

run.bash:

    #!/bin/bash

    # Usage: run.bash CP [CP-OPTION...]
    CP=$1
    shift

    [[ -e "$CP" ]] || {
        echo "no file found: $CP" 1>&2
        exit 1
    }

    N=8                     # 1024 * N files are created and copied
    S=drop-src
    D=${HOME}/drop-dst

    mkdir -p "$S"
    mkdir -p "$D"

    start=
    end=

    # Print the "Cached:" line of /proc/meminfo (value is in kB).
    print_cached()
    {
        grep ^Cached: /proc/meminfo
    }

    start()
    {
        start=$(print_cached | awk '{print $2}')
    }

    end()
    {
        end=$(print_cached | awk '{print $2}')
    }

    report()
    {
        echo -n "delta[$N:$1/$2]: "
        expr "$end" - "$start"
    }

    # Remove files in batches first to keep the glob expansions short.
    cleanup()
    {
        local i
        local j
        for ((i = 0; i < 10; i++)); do
            for ((j = 0; j < 10; j++)); do
                rm -f "$S"/F-${i}${j}*
                rm -f "$D"/F-${i}${j}*
            done
        done
        rm -f "$S"/F-*
        rm -f "$D"/F-*
    }

    # Create the 8 KB source files.
    prep()
    {
        local i
        for ((i = 0; i < 1024 * $N; i++ )); do
            if ! dd if=/dev/zero of="$S/F-$i" bs=4096 count=2 \
                 status=none; then
                echo "failed in dd of=$S/F-$i" 1>&2
                exit 1
            fi
        done
        sync
    }

    run_cp()
    {
        start
        local i
        time for ((i = 0; i < 1024 * $N; i++ )); do
            if ! "${CP}" "$@" "$S/F-$i" "$D/F-$i"; then
                echo "failed in cp " "$@" "$S/F-$i" " $D/F-$i" 1>&2
                exit 1
            fi
        done
        end
        report "$1" "$2"
    }

    cleanup
    sync
    prep
    run_cp "$@"

running:

    ~/coreutils/nocache$ ./run.bash ../src/cp

    real    0m16.051s
    user    0m4.249s
    sys     0m12.437s
    delta[8:/]: 65548

    ~/coreutils/nocache$ ./run.bash ../src/cp --nocache-source

    real    0m17.109s
    user    0m4.492s
    sys     0m13.317s
    delta[8:--nocache-source/]: 620

The --nocache-source option greatly reduces cache consumption: the
Cached delta falls from 65548 kB to 620 kB.

Thanks for the patch.
I have some reservations/notes though...

There is nothing particularly special about cp that would make it need
this option. I.e., it would be nice to be able to wrap any program so
that it streamed data through the cache rather than caching it
aggressively. I'm not sure how to do that, but I'd also be reluctant to
start adding such options to individual commands.

Perhaps Linux's open() may gain an O_STREAM flag in future, which could
be applied more generally with a wrapper or something.

For single (large) files, one already has this functionality in dd.

On the write side, you'd also have to worry about syncing to make the
drop-cache advice effective, and this could impact performance.

Might this drop caches for already cached files that cp just happens to
be copying, thus potentially impacting performance for other programs?

If reflinking, we probably would not want to do this operation, since
we're not reading the source.

thanks,
Pádraig
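
For reference, the dd functionality mentioned above could be used
roughly as follows. This is only a minimal sketch, assuming GNU dd with
its nocache flag; the scratch files created with mktemp are hypothetical
stand-ins for a real source and destination. Note the fdatasync on the
write side, which relates to the syncing concern raised above.

```shell
#!/bin/bash
# Sketch: copy one file with GNU dd while advising the kernel to drop
# the cached data for both source and destination.
set -eu

src=$(mktemp)    # hypothetical scratch source file
dst=$(mktemp)    # hypothetical scratch destination file
trap 'rm -f "$src" "$dst"' EXIT

# Create 4 MiB of test data.
dd if=/dev/zero of="$src" bs=1M count=4 status=none

# Stream the copy, dropping cache for each processed portion.
dd if="$src" of="$dst" bs=1M iflag=nocache oflag=nocache status=none

# Write side: flush data to storage first so the drop advice can take
# effect for the whole file (count=0 means no data is transferred).
dd of="$dst" oflag=nocache conv=notrunc,fdatasync count=0 status=none

# Read side: advise dropping the cache for the whole source file.
dd if="$src" iflag=nocache count=0 status=none

if cmp -s "$src" "$dst"; then
    result="copy ok"
else
    result="copy differs"
fi
echo "$result"
```

The drop is advisory only, so how much cache is actually released may
vary with the kernel and filesystem in use.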
