On Sunday, 27 October, 2019 13:30, Richard Damon <rich...@damon-family.org> 
wrote:

>On 10/27/19 10:24 AM, Simon Slavin wrote:

>> On 27 Oct 2019, at 9:12am, Thomas Kurz <sqlite.2...@t-net.ruhr> wrote:

>>> the discussion standard deviation has shown that at least STDEV and
>>> POWER would be very helpful if they part of SQLite core.

>> These are presentation issues.  Not database issues.   The results of
>> such calculations are unlikely to be used to decide on the continuation
>> of a SQL statement.

> Doing a group by summary query to get the mean and stdev of a sample in
> the database, and then using those to construct a where clause to
> get/exclude data points that are outliers would seem to be a very
> reasonable sort of query that is really a database issue. This could
> normally be done as two distinct queries (maybe within a read
> transaction so we get consistent data) and the stdev and mean built by
> building a count of records, and the sum of the value and value**2, and
> computing the mean and stdev from that in code, as was shown earlier in
> the thread, but having STDEV as a built-in summary function could be
> useful.

Defining an aggregate function that computes the standard deviation (whether of 
the sample or the population) depending of the values fed into the aggregate is 
rather trivial to do, if you need to do that.  That there are some programming 
languages and platforms that do not permit you to do this "easily" speaks more 
to the deficiencies of those programming languages or platforms than it does to 
SQLite3.  You still have to decide whether you want to do it with inaccurate 
arithmetic methods that may be compatible with how window functions work, or 
using more accurate numerical methods that are not compatible with how window 
functions work, or perhaps by implementing something that is both at the same 
time (and uses more memory and cpu (complexity) in order to allow utilization 
of better numerical methods yet still maintain window function compatibity).

Or you could simply execute a query to get the samples you want and do the 
calculation in your code, then pass that "answer" back as a parameter to your 
subsequent SQL that needs that value.  The need to write "one long monster SQL 
statement to do everything all at once" is usually adopted because one is used 
to using client/server systems where such things take excessive amounts of time 
due to client/server turnarounds.

The point is that how you accomplish this is up to you, and however you decide 
to do it is up to you and your requirements.  If SQLite3 had a built-in stdev 
aggregate/window function, you can be sure that someone would determine that 
the built-in was insufficient or overkill to their requirement and would end up 
overriding that function with one which they, in their sole discretion, thought 
was a better fit to their requirement.  This would end up bloating the size of 
the engine for no gain -- or adding yet another configuration parameter to 
exclude the extraneous built-in.  Multiply this by the thousands of possible 
functions and aggregates and you get an idea of the complexity involved.

As an example, I always compile in a proper version of the ROUND() function 
because the builtin function does elementary school 4/5 rounding and not 
half-even rounding.  Even the builtin SUM and TOTAL are rather simplistic 
implementations that can be wildly inaccurate in certain circumstances.  
However, it appears that most "coders" only graduated elementary school and 
would be confused by proper rounding, just as they do not know how binary 
digital computers work and are thus confused by floating-point numbers not 
being the same as they learned in elementary school.

I also always override a pile of the builtin datetime functions to overcome 
some often encountered shortcomings there as well (though that one does require 
a minor modification to the as-distributed code to add some more information 
that would otherwise be lost to the internal DateTime struct, and to the 
parsing routines in order to maintain that extra data).

This can be done because SQLite3 has chosen extensibility over "Do as I have 
Ordained, and If You Do Not Like That Which I Have Ordained, then Too Bad For 
You" route used by most other RDBMS engines.  And even there, on the few other 
RDBMS where such overrides of internal functionality or addition of "extra" 
functionality is available, it is almost always far more complicated (almost, 
becase there are a few exceptions), and often with poorer results overall (such 
as added complication simply for complication sake or slower performance or, in 
many cases, both).

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to