Hi, I'm another researcher who uses quite a bit of the historical data held in these services, and I appreciate the commitment to keeping this data available where possible.
In the Labs article <https://labs.ripe.net/author/kistel/ripe-ncc-measurement-data-retention-principles/>, there's a statement that: "For the RIPEstat use-case, we make the data available in a variety of ways which takes up about 800 TB of storage space." This reads to me as if there's a lot of (potentially unnecessary?) data duplication, so I think proposal 2 sounds sensible. I would imagine that some or all of the served formats could be reconstructed from a canonical copy, so for older data, would producing these formats on the fly (or converting between formats on request) be feasible?

Is there a way to get a breakdown of which data formats are the most storage-intensive, or which parts of services like RIPEstat use the most storage?

I imagine there aren't many use cases where instant access to historic data is needed, so making access to older data slower/tiered (and hence cheaper) doesn't seem like a problem. That said, I'm looking at this very much from a research perspective, so I could be off the mark on that.

Kind regards,
Josh