On Tue, Jul 30, 2019 at 7:26 PM Mats Wichmann <m...@wichmann.us> wrote:
>
> On 7/30/19 5:58 PM, Alan Gauld via Tutor wrote:
> > On 30/07/2019 17:21, boB Stepp wrote:
> >
> >> musings I am wondering about -- in general -- whether it is best to
> >> store calculated data values in a file and reload these values, or
> >> whether to recalculate such data upon each new run of a program.
> >
> > It depends on the use case.
> >
> > For example a long running server process may not care about startup
> > delays because it only starts once (or at least very rarely) so either
> > approach would do but saving diskspace may be helpful so calculate the
> > values.
> >
> > On the other hand a data batch processor running once as part of a
> > chain working with high data volumes probably needs to start quickly.
> > In which case do the calculations take longer than reading the
> > extra data? Probably, so store in a file.
> >
> > There are other options too such as calculating the value every
> > time it is used - only useful if the data might change
> > dynamically during the program execution.
> >
> > It all depends on how much data?, how often it is used?,
> > how often would it be calculated? How long does the process
> > run for? etc.
>
>
> Hey, boB - I bet you *knew* the answer was going to be "it depends" :)

You are coming to know me all too well! ~(:>)) I just wanted to check
with the professionals here whether my thinking (concealed behind the
questions I asked) was correct or, if not, where I am off.

> There are two very common classes of application that have to make this
> very decision - real databases, and their toy cousins, spreadsheets.
>
> In the relational database world - characterized by very long-running
> processes (like: unless it crashes, runs until reboot, and maybe even
> beyond that - if you have a multi-mode replicated or distributed DB it
> may survive failure of one point) - if a field is calculated it's not
> stored. Because - what Alan said: in an RDBMS, data are _expected_ to
> change during runtime. And then for performance reasons, there may be
> some cases where it's precomputed and stored to avoid huge delays when
> the computation is expensive. That world even has a term for that: a
> materialized view (in contrast to a regular view). It can get pretty
> tricky, you need something that causes the materialized view to update
> when data has changed; for databases that don't natively support the
> behavior you then have to fiddle with triggers and hopefully it works
> out. More enlightened now?

Not more enlightened, perhaps, but more convinced than ever of how
difficult it is to manage the complexity of real world programs.

--
boB
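
For anyone reading this thread later, here is a rough sketch of the "store it, but notice when the data has changed" idea discussed above, done with a plain file instead of a database. It is only an illustration under stated assumptions: the names expensive_calculation, fingerprint and load_or_compute, and the JSON layout of the cache file, are all made up for this example and do not come from anyone's actual program. The stored result is tagged with a hash of its inputs, so a stale result is recalculated rather than reused, which is loosely the job a materialized-view refresh does.

```python
import hashlib
import json
from pathlib import Path


def expensive_calculation(values):
    """Hypothetical stand-in for a slow computation."""
    return {"total": sum(values), "mean": sum(values) / len(values)}


def fingerprint(values):
    """Hash the inputs so a stored result can be matched to its source data."""
    return hashlib.sha256(json.dumps(values).encode("utf-8")).hexdigest()


def load_or_compute(values, cache_path):
    """Return (result, was_cached); recalculate if the inputs have changed."""
    cache_path = Path(cache_path)
    key = fingerprint(values)

    if cache_path.exists():
        try:
            cached = json.loads(cache_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            cached = None  # unreadable or corrupt cache: fall through and recompute
        if isinstance(cached, dict) and cached.get("key") == key:
            return cached["result"], True

    # No usable stored result, so do the work and save it for the next run.
    result = expensive_calculation(values)
    cache_path.write_text(
        json.dumps({"key": key, "result": result}), encoding="utf-8"
    )
    return result, False
```

Calling load_or_compute([1, 2, 3], "cache.json") twice should calculate on the first call and read the file on the second; changing the list forces a recalculation. Whether this beats simply recalculating every time still "depends": hashing and reading the file have their own cost, so it only pays off when the calculation is slower than both.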