Bug#930803: new program: runcached

Andras Korn Wed, 26 Jun 2019 03:27:19 -0700

On Tue, Jun 25, 2019 at 08:22:12PM +0200, Nicolas Schier wrote:

Hi,


> > I just wrote this script:
> > https://gist.github.com/akorn/51ee2fe7d36fa139723c851d87e56096 and thought
> > it might be a good addition to moreutils.
> > 
> > It caches the stdout, stderr and exit status of arbitrary commands for a
> > configurable length of time, returning data from cache on subsequent
> > invocations if the cache is still fresh.
> 
> thanks for your suggestion; I think it's quite an interesting idea!
> And you made me curious:  which commands are you running through
> 'runcached'?  All programs I thought of have their basic functionality
> based on side-effects as file system or network access.

Examples:

 * I have a Makefile to regenerate the data.cdb file for tinydns whenever
   any of the local source files change. Part of the process is downloading
   some remote DNS zones with axfr-get, which is relatively slow. The remote
   zones don't change frequently; I don't want to download them each time,
   but I do want to download them at least once every eight hours or so. So
   I run axfr-get via runcached, with a cache ttl of 8 hours.

 * Some monitoring systems like Zabbix and Munin need to run data-gathering
   stuff that is time-consuming; it sometimes happens that several plugins
   need to run the same thing, but extract different data from its output.
   Instead of having separate, ad-hoc caching in all such plugins, it's
   better to have a generic caching solution.

 * When looking for space hogs in the filesystem, I often use
   'du -hscx * .* | sort -h' and then dig down further. Without runcached
   I would have to either save the output separately, or keep opening new
   sessions in screen(1), or wait for the same output to be generated again
   when I go back up to the parent directory. With runcached, `du` is
   cheap the 2nd time, and I don't care if the numbers are slightly off when
   they come from the cache.

 * Same for ad-hoc log analysis sessions: I may grep through a bunch of
   logs, then grep for something else, then grep for the same thing again.
   With runcached, I don't need to worry about saving output I may need
   again, because runcached does it for me.
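
For readers who want the idea in code: below is a minimal sketch in Python
of the caching behaviour described above (stdout, stderr and exit status
kept for a configurable time). It is not the zsh script from the gist and
makes no claim about how that script stores its data; the function name
run_cached, the JSON file layout and the SHA-256 cache key are all
assumptions made for illustration.

```python
import hashlib
import json
import os
import subprocess
import tempfile
import time


def run_cached(argv, ttl, cache_dir):
    """Return (stdout, stderr, exit_status, from_cache) for argv.

    Hypothetical helper: a result younger than ttl seconds is replayed
    from cache_dir instead of running the command again.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Assumption: the cache entry is keyed on the exact argument vector.
    key = hashlib.sha256("\0".join(argv).encode()).hexdigest()
    path = os.path.join(cache_dir, key + ".json")

    try:
        # The entry is fresh if it was written less than ttl seconds ago.
        if time.time() - os.path.getmtime(path) < ttl:
            with open(path) as f:
                entry = json.load(f)
            return entry["stdout"], entry["stderr"], entry["status"], True
    except (OSError, ValueError):
        pass  # missing or unreadable entry: fall through and run the command

    proc = subprocess.run(argv, capture_output=True, text=True)
    entry = {
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "status": proc.returncode,
    }
    # Write to a temporary file and rename it into place, so that a
    # concurrent reader never sees a half-written entry.
    fd, tmp = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "w") as f:
        json.dump(entry, f)
    os.replace(tmp, path)
    return proc.stdout, proc.stderr, proc.returncode, False
```

With this sketch, something like run_cached(["du", "-sx", "/var"], 8 * 3600,
"/tmp/cache") would run du once and replay the stored output and exit status
for the next eight hours. Note that a failing command is cached too, since
the exit status is part of the entry.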

> > It currently has semi-esoteric dependencies: it's written in zsh and uses
> > chpst from the runit package for locking. If you're willing to include the
> > script I can change it to use flock(1) instead, but I'm not rewriting it in
> > POSIX sh.
> 
> Adding new scripts to the moreutils collection is usually done by
> forwarding to the upstream maintainer (Joey Hess <jo...@joey.name>) and
> asking for script inclusion.  But, as Joey keeps more than just one eye
> on cross platform compatibility, I expect non-POSIX implementations to
> be rejected.  Do you keep your non-POSIX statement?

I modified it to use zsh's system module for locking, but I'm sticking with
zsh; I have no interest in rewriting it in plain Bourne sh. zsh isn't much
less cross-platform than, say, perl or Python.
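
To illustrate why locking matters here: if two callers find the cache stale
at the same moment, both would run the slow command. The sketch below shows
the general flock-style approach in Python, using the standard fcntl module
on a Unix-like system. It is not the zsh code in the script; the name
cache_lock and the use of a dedicated lock file are assumptions.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def cache_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the with-block.

    Hypothetical helper: a caller would check and refresh the cache entry
    while holding the lock, so only one process runs the command.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

A caller would wrap the "is the cache fresh? if not, run and store" step in
"with cache_lock(path):", so a second process waits and then reads the
freshly written entry instead of running the command again.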

The --prune-cache functionality probably depends on GNU find(1).

I'm offering the script in the belief that it might be useful to others, but
getting it into moreutils is not a priority for me.

> Did you think about the license you want to stick it to? GPL2+?

I was thinking GPLv3+, but if the rest of moreutils is GPL2+, I'm fine with
that too.

András

-- 
        A synonym is a word you use when you can't spell the other one.
