On Sat, Apr 08, 2017 at 07:52:54PM -0700, Colin Percival wrote:
> On 04/04/17 13:06, Robie Basak wrote:
> > I'd like to retrieve and permanently archive (offline) a full set of
> > archives stored with one particular key using Tarsnap.
> >
> > These are of course deduplicated at Tarsnap's end. But if I download
> > them one at a time (using something like "tarsnap --list-archives|xargs
> > tarsnap -r ..." for example), it'll cost me a ton of bandwidth - both at
> > my end which is metered, and in Tarsnap's bandwidth charges.
> >
> > I'd like my bandwidth bill to be the "Compressed size/(unique data)"
> > figure from --print-stats, not the "Compressed size/All archives"
> > figure. Since the redundancy is there and my client has all the details,
> > is there any way I can take advantage of this?
>
> Not right now.  This is something I've been thinking about implementing,
> but it's rather complicated (the tarsnap "read" path would need to look at
> data on disk to see what it can "reuse", and normally it doesn't read any
> files from disk).

In case it helps others, I hacked together a client-side cache for this
one task. It appears to have worked. Patch below.

This is absolutely a hack and not production ready (no concurrency, bad
error handling, hardcoded cache path whose directory must be created in
advance and permissions set manually, etc.), but for a one-off task it
was enough for me to get my data out.

---
 tar/storage/storage_read.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/tar/storage/storage_read.c b/tar/storage/storage_read.c
index 2c19650..62bf6b7 100644
--- a/tar/storage/storage_read.c
+++ b/tar/storage/storage_read.c
@@ -13,6 +13,7 @@
 #include "storage_internal.h"
 #include "sysendian.h"
 #include "warnp.h"
+#include "hexify.h"
 
 #include "storage.h"
 
@@ -313,6 +314,20 @@ storage_read_file(STORAGE_R * S, uint8_t * buf, size_t buflen,
 		}
 	}
 
+	int old_errno = errno;
+	char hashbuf[65];
+	hexify(name, hashbuf, 32);
+	char *cache_path;
+	if(asprintf(&cache_path, "/tmp/tarsnap-cache/%c-%s", class, hashbuf) < 0) abort();
+	FILE *fp = fopen(cache_path, "r");
+	if (fp) {
+		if (fread(buf, buflen, 1, fp) != 1) abort();
+		if (fclose(fp)) abort();
+		free(cache_path);
+		return 0;
+	} else {
+		errno = old_errno;
+	}
 	/* Initialize structure. */
 	C.buf = buf;
 	C.buflen = buflen;
@@ -326,6 +341,13 @@ storage_read_file(STORAGE_R * S, uint8_t * buf, size_t buflen,
 		goto err0;
 
 done:
+	if (!C.status) {
+		FILE *fp = fopen(cache_path, "w");
+		if (!fp) abort();
+		if(fwrite(buf, buflen, 1, fp) != 1) abort();
+		if(fclose(fp)) abort();
+	}
+	free(cache_path);
 	/* Return status code from server. */
 	return (C.status);
 
-- 
2.7.4
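
For anyone adapting this, here is a standalone sketch of the same
lookup/store logic with the abort() calls replaced by return codes, and
with the cache file written to a temporary name and then rename()d into
place, so an interrupted run should not leave a truncated entry behind.
This is not tarsnap code: cache_path(), cache_get(), cache_put() and the
CACHE_DIR value are hypothetical names made up for illustration, and the
directory is still assumed to exist already. It is a sketch under those
assumptions, not a tested replacement for the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical cache location; the directory must already exist. */
#define CACHE_DIR "/tmp/tarsnap-cache"
#define HASH_LEN 32

/* Build "<dir>/<cls>-<hex of hash>" in out. Returns 0, or -1 if too long. */
static int
cache_path(char *out, size_t outlen, const char *dir, char cls,
    const uint8_t hash[HASH_LEN])
{
	static const char hex[] = "0123456789abcdef";
	char hexbuf[2 * HASH_LEN + 1];
	size_t i;
	int n;

	assert(out != NULL && dir != NULL);
	for (i = 0; i < HASH_LEN; i++) {
		hexbuf[2 * i] = hex[hash[i] >> 4];
		hexbuf[2 * i + 1] = hex[hash[i] & 0x0f];
	}
	hexbuf[2 * HASH_LEN] = '\0';
	n = snprintf(out, outlen, "%s/%c-%s", dir, cls, hexbuf);
	return ((n < 0 || (size_t)n >= outlen) ? -1 : 0);
}

/* Returns 0 on a hit (buf filled), 1 on a miss or a wrong-sized entry. */
static int
cache_get(const char *path, uint8_t *buf, size_t buflen)
{
	FILE *fp;
	size_t n;
	int ok;

	if ((fp = fopen(path, "rb")) == NULL)
		return (1);	/* Any open failure is treated as a miss. */
	n = fread(buf, 1, buflen, fp);
	/* Require exactly buflen bytes: no short read, nothing left over. */
	ok = (n == buflen && fgetc(fp) == EOF && !ferror(fp));
	fclose(fp);
	return (ok ? 0 : 1);
}

/* Write buf to a temporary file, then rename it over path. 0 or -1. */
static int
cache_put(const char *path, const uint8_t *buf, size_t buflen)
{
	char tmp[4096];
	FILE *fp;
	int n;

	n = snprintf(tmp, sizeof(tmp), "%s.tmp.%ld", path, (long)getpid());
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return (-1);
	if ((fp = fopen(tmp, "wb")) == NULL)
		return (-1);
	if (fwrite(buf, 1, buflen, fp) != buflen) {
		fclose(fp);
		goto err;
	}
	if (fclose(fp))
		goto err;
	if (rename(tmp, path))
		goto err;
	return (0);
err:
	remove(tmp);
	return (-1);
}
```

In the patched function the idea would be to call cache_get() before
issuing the network request and cache_put() only after a read that
completed with a zero status, falling back to the network on any
nonzero return instead of aborting.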