On 7/30/19 5:58 PM, Alan Gauld via Tutor wrote:
> On 30/07/2019 17:21, boB Stepp wrote:
>
>> musings I am wondering about -- in general -- whether it is best to
>> store calculated data values in a file and reload these values, or
>> whether to recalculate such data upon each new run of a program.
>
> It depends on the use case.
>
> For example a long running server process may not care about startup
> delays because it only starts once (or at least very rarely) so either
> approach would do but saving diskspace may be helpful so calculate the
> values.
>
> On the other hand a data batch processor running once as part of a
> chain working with high data volumes probably needs to start quickly.
> In which case do the calculations take longer than reading the
> extra data? Probably, so store in a file.
>
> There are other options too such as calculating the value every
> time it is used - only useful if the data might change
> dynamically during the program execution.
>
> It all depends on how much data?, how often it is used?,
> how often would it be calculated? How long does the process
> run for? etc.
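To make the "store in a file and reload" option concrete, here is a
minimal sketch. The names expensive_calculation and load_or_compute and
the JSON file layout are made up for illustration; they are not from
boB's program. It computes once, writes the result out, and reloads it
on later calls.

```python
import json
import os
import tempfile


def expensive_calculation(n):
    """Hypothetical stand-in for a slow computation."""
    return [i * i for i in range(n)]


def load_or_compute(path, n):
    """Reload stored values from `path` if the file exists;
    otherwise compute them and store them for the next run.
    Returns (values, source) so the caller can see which path ran."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f), "file"
    values = expensive_calculation(n)
    with open(path, "w") as f:
        json.dump(values, f)
    return values, "computed"


# Use a throwaway directory so the sketch leaves nothing behind in cwd.
cache_path = os.path.join(tempfile.mkdtemp(), "squares.json")

first, first_source = load_or_compute(cache_path, 5)    # computes and stores
second, second_source = load_or_compute(cache_path, 5)  # reloads from file
print(first_source, second_source)
```

Note that this sketch never checks whether the stored file is stale. If
the inputs can change between runs, you would need some way to detect
that, which is exactly the "it depends" part.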
Hey, boB - I bet you *knew* the answer was going to be "it depends" :)

There are two very common classes of application that have to make this
very decision - real databases, and their toy cousins, spreadsheets.

In the relational database world - characterized by very long-running
processes (as in: unless it crashes, it runs until reboot, and maybe
even beyond that; a multi-node replicated or distributed DB may survive
the failure of one node) - a calculated field is normally not stored.
The reason is what Alan said: in an RDBMS, data are _expected_ to
change during runtime.

For performance reasons, though, there may be some cases where the
value is precomputed and stored to avoid huge delays when the
computation is expensive. That world even has a term for that: a
materialized view (in contrast to a regular view, which is recomputed
each time it is queried). It can get pretty tricky: you need something
that causes the materialized view to update when the data has changed.
For databases that don't natively support the behavior, you have to
fiddle with triggers and hope it works out.

More enlightened now?
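P.S. In case the trigger fiddling sounds abstract, here is a rough
sketch using Python's sqlite3 module (SQLite has no native materialized
views). The orders and customer_totals tables and the trigger names are
invented for the example, and only INSERT and DELETE are covered. An
UPDATE of an amount would need a third trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);

    -- Hand-rolled stand-in for a materialized view: an ordinary table
    -- holding the precomputed per-customer totals.
    CREATE TABLE customer_totals (
        customer TEXT PRIMARY KEY, total INTEGER);

    -- Keep the stored totals in step with inserts...
    CREATE TRIGGER orders_insert AFTER INSERT ON orders
    BEGIN
        INSERT OR IGNORE INTO customer_totals (customer, total)
            VALUES (NEW.customer, 0);
        UPDATE customer_totals SET total = total + NEW.amount
            WHERE customer = NEW.customer;
    END;

    -- ...and with deletes.
    CREATE TRIGGER orders_delete AFTER DELETE ON orders
    BEGIN
        UPDATE customer_totals SET total = total - OLD.amount
            WHERE customer = OLD.customer;
    END;
""")

conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 10), ("bob", 5), ("alice", 7)],
)
conn.execute("DELETE FROM orders WHERE customer = 'bob'")

# Stored (precomputed) totals versus totals recalculated on demand.
stored = dict(conn.execute("SELECT customer, total FROM customer_totals"))
recomputed = dict(conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer"))
print(stored, recomputed)
```

The stored table still carries a zero row for bob after his only order
is deleted, while the recalculated query has no bob row at all. Small
mismatches like that are the "pretty tricky" part.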