Hi Georg,
the Flink implementations seem to be based off of SQL Server[1], which
has similar (though better documented) behavior for integer-like data types.
[1]
https://docs.microsoft.com/en-us/sql/t-sql/functions/avg-transact-sql?view=sql-server-ver15
Best
Ingo
On 24.03.22 16:08, Georg Heiler wrote:
Hi,
I observe strange behavior in Flink SQL:
For an input stream:
CREATE TABLE input_stream (
duration int,
rating int
) WITH (
'connector' = 'kafka',
'topic' = 't',
'scan.startup.mode' = 'earliest-offset',
'format' = 'avro-confluent',
'avro-confluent.schema-registry.url' = 'http://localhost:8081/
<http://localhost:8081/>',
'properties.group.id <http://properties.group.id>' = 'flink-test-001',
'properties.bootstrap.servers' = 'localhost:9092'
);
The following SQL:
SELECT AVG(duration) AS duration_mean, AVG(CAST(rating AS DOUBLE)) AS
rating_mean FROM input_stream;
returns:
duration_mean rating_mean
45 2.503373819163293
I.e. duration_mean is truncated to an INT!
Any other database system I know by default outputs a DOUBLE type for
any input (including INT) and does not truncate it.
Why does Flink decide to truncate here? Why is a manual type cast necessary?
Best,
Georg