On Sunday, 27 October, 2019 13:30, Richard Damon <rich...@damon-family.org> wrote:
>On 10/27/19 10:24 AM, Simon Slavin wrote: >> On 27 Oct 2019, at 9:12am, Thomas Kurz <sqlite.2...@t-net.ruhr> wrote: >>> the discussion standard deviation has shown that at least STDEV and >>> POWER would be very helpful if they part of SQLite core. >> These are presentation issues. Not database issues. The results of >> such calculations are unlikely to be used to decide on the continuation >> of a SQL statement. > Doing a group by summary query to get the mean and stdev of a sample in > the database, and then using those to construct a where clause to > get/exclude data points that are outliers would seem to be a very > reasonable sort of query that is really a database issue. This could > normally be done as two distinct queries (maybe within a read > transaction so we get consistent data) and the stdev and mean built by > building a count of records, and the sum of the value and value**2, and > computing the mean and stdev from that in code, as was shown earlier in > the thread, but having STDEV as a built-in summary function could be > useful. Defining an aggregate function that computes the standard deviation (whether of the sample or the population) depending of the values fed into the aggregate is rather trivial to do, if you need to do that. That there are some programming languages and platforms that do not permit you to do this "easily" speaks more to the deficiencies of those programming languages or platforms than it does to SQLite3. You still have to decide whether you want to do it with inaccurate arithmetic methods that may be compatible with how window functions work, or using more accurate numerical methods that are not compatible with how window functions work, or perhaps by implementing something that is both at the same time (and uses more memory and cpu (complexity) in order to allow utilization of better numerical methods yet still maintain window function compatibity). Or you could simply execute a query to get the samples you want and do the calculation in your code, then pass that "answer" back as a parameter to your subsequent SQL that needs that value. The need to write "one long monster SQL statement to do everything all at once" is usually adopted because one is used to using client/server systems where such things take excessive amounts of time due to client/server turnarounds. The point is that how you accomplish this is up to you, and however you decide to do it is up to you and your requirements. If SQLite3 had a built-in stdev aggregate/window function, you can be sure that someone would determine that the built-in was insufficient or overkill to their requirement and would end up overriding that function with one which they, in their sole discretion, thought was a better fit to their requirement. This would end up bloating the size of the engine for no gain -- or adding yet another configuration parameter to exclude the extraneous built-in. Multiply this by the thousands of possible functions and aggregates and you get an idea of the complexity involved. As an example, I always compile in a proper version of the ROUND() function because the builtin function does elementary school 4/5 rounding and not half-even rounding. Even the builtin SUM and TOTAL are rather simplistic implementations that can be wildly inaccurate in certain circumstances. However, it appears that most "coders" only graduated elementary school and would be confused by proper rounding, just as they do not know how binary digital computers work and are thus confused by floating-point numbers not being the same as they learned in elementary school. I also always override a pile of the builtin datetime functions to overcome some often encountered shortcomings there as well (though that one does require a minor modification to the as-distributed code to add some more information that would otherwise be lost to the internal DateTime struct, and to the parsing routines in order to maintain that extra data). This can be done because SQLite3 has chosen extensibility over "Do as I have Ordained, and If You Do Not Like That Which I Have Ordained, then Too Bad For You" route used by most other RDBMS engines. And even there, on the few other RDBMS where such overrides of internal functionality or addition of "extra" functionality is available, it is almost always far more complicated (almost, becase there are a few exceptions), and often with poorer results overall (such as added complication simply for complication sake or slower performance or, in many cases, both). -- The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume. _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users