[ 
https://issues.apache.org/jira/browse/CASSANDRA-12417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545361#comment-15545361
 ] 

Sylvain Lebresne commented on CASSANDRA-12417:
----------------------------------------------

bq. Even if the new behavior is probably better than the previous one it is 
still a change of behavior which can surprise some users.

For what it's worth, at least as far as the average function is concerned, I 
think it's a bit of a stretch: I don't see how anyone could rely on an average 
function that returns completely broken results for no apparent reason (how the 
function is implemented is the definition of an implementation detail, there is 
no overflow in either the input or the output in this case). Meaning that 
either users haven't realized this can be a problem, and I can guarantee 
they'll be _very_ happy it's fixed, or they did realize and avoided using the 
function in case it would internally overflow, and the change is harmless to 
them. Basically, I genuinely think the current behavior is a bug and not fixing 
it (on 3.0) because it's a change of behavior is akin to saying that we should fix 
no bug ever. I don't even think we should particularly call attention to the 
change in {{NEWS.txt}} (it's a bug fix like any other), though it's not like 
I'm going to fight it if you really want to.
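
To make the "implementation detail" point concrete, here is a minimal Java sketch. The class and method names are hypothetical and this is not the actual Cassandra aggregate code; it only assumes that the sum is accumulated in the same 32-bit type as the input. Neither the inputs nor the true average overflow an int, yet the naive version returns garbage:

```java
// Hypothetical illustration only; not the Cassandra implementation.
final class AvgOverflowSketch {
    // Accumulates in a 32-bit int, so the intermediate sum can wrap around.
    // Assumes a non-empty input array.
    static int naiveAvg(int[] values) {
        int sum = 0;
        for (int v : values) {
            sum += v; // wraps silently once the sum passes Integer.MAX_VALUE
        }
        return sum / values.length;
    }

    // Same computation with a 64-bit accumulator; the result fits in an int
    // because the average of ints always lies within the int range.
    static int wideAvg(int[] values) {
        long sum = 0L;
        for (int v : values) {
            sum += v;
        }
        return (int) (sum / values.length);
    }

    public static void main(String[] args) {
        int[] values = {2_000_000_000, 2_000_000_000};
        System.out.println(naiveAvg(values)); // prints -147483648
        System.out.println(wideAvg(values));  // prints 2000000000
    }
}
```

Both inputs and the expected output (2000000000) are valid int values; only the internal sum overflows.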

The changes to the sum are arguably more debatable (as far as 3.0 is 
concerned). I kind of doubt anyone would get pissed at us improving the 
precision of the function, so I would be fine going 3.0 here too, but I don't 
mind leaving that part 3.X only if we want to play it extra safe.


> Built-in AVG aggregate is much less useful than it should be
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-12417
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12417
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL
>            Reporter: Branimir Lambov
>            Assignee: Alex Petrov
>
> For fixed-size integer types overflow is all but guaranteed to happen, 
> yielding an incorrect result. While for sum it is somewhat acceptable as the 
> result cannot fit the type, this is not the case for average.
> As the result of average is always within the scope of the source type, 
> failing to produce it only signifies a bad implementation. Yes, one can solve 
> this by type-casting, but do we really want to always have to be telling 
> people that the correct spelling of the average function is 
> {{cast(avg(cast(value as bigint)) as int)}}, especially if this is so 
> trivial to fix?
> Additionally, the straightforward addition we use for floating point versions 
> is not a good choice numerically for larger numbers of values. We should 
> switch to a more stable version, e.g. iterative mean using {{avg = avg + 
> (value - avg) / count}}.
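
The iterative mean quoted in the description can be sketched in Java as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, it is not the code that went into Cassandra, and the naive variant is included purely for comparison.

```java
// Hypothetical illustration only; not the Cassandra implementation.
final class IterativeMeanSketch {
    // Running mean using avg = avg + (value - avg) / count.
    // No large intermediate sum is kept. Returns 0.0 for an empty array.
    static double iterativeMean(double[] values) {
        double avg = 0.0;
        long count = 0L;
        for (double v : values) {
            count++;
            avg += (v - avg) / count;
        }
        return avg;
    }

    // Straightforward sum-then-divide, shown for comparison.
    // Assumes a non-empty input array.
    static double naiveMean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        double[] small = {1.0, 2.0, 3.0, 4.0};
        System.out.println(iterativeMean(small)); // prints 2.5

        double[] huge = {Double.MAX_VALUE, Double.MAX_VALUE};
        System.out.println(naiveMean(huge));      // prints Infinity
        System.out.println(iterativeMean(huge) == Double.MAX_VALUE); // prints true
    }
}
```

The second example shows one visible effect of not accumulating a full sum: the naive version overflows to Infinity while the running mean stays at the correct value.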



