Yunfeng Zhou created FLINK-30959:
------------------------------------
Summary: UNIX_TIMESTAMP's return value does not meet expected
Key: FLINK-30959
URL: https://issues.apache.org/jira/browse/FLINK-30959
Project: Flink
Issue Type: Bug
Components: Table SQL / API
Affects Versions: 1.15.2
Reporter: Yunfeng Zhou
Consider the following PyFlink program:
{code:python}
import pandas as pd
from pyflink.datastream import StreamExecutionEnvironment, HashMapStateBackend
from pyflink.table import StreamTableEnvironment

if __name__ == "__main__":
    # Rows alternate between a +0800 and a +0000 zone offset.
    input_data = pd.DataFrame(
        [
            ["Alex", 100.0, "2022-01-01 08:00:00.001 +0800"],
            ["Emma", 400.0, "2022-01-01 00:00:00.003 +0000"],
            ["Alex", 200.0, "2022-01-01 08:00:00.005 +0800"],
            ["Emma", 300.0, "2022-01-01 00:00:00.007 +0000"],
            ["Jack", 500.0, "2022-01-01 08:00:00.009 +0800"],
            ["Alex", 450.0, "2022-01-01 00:00:00.011 +0000"],
        ],
        columns=["name", "avg_cost", "time"],
    )

    env = StreamExecutionEnvironment.get_execution_environment()
    env.set_state_backend(HashMapStateBackend())
    t_env = StreamTableEnvironment.create(env)

    input_table = t_env.from_pandas(input_data)
    t_env.create_temporary_view("input_table", input_table)

    # The pattern includes milliseconds and a zone offset ("X").
    time_format = "yyyy-MM-dd HH:mm:ss.SSS X"
    output_table = t_env.sql_query(
        f"SELECT *, UNIX_TIMESTAMP(`time`, '{time_format}') AS unix_time "
        "FROM input_table"
    )
    output_table.execute().print()
{code}
The actual output is
{code}
+----+--------------------------------+--------------------------------+--------------------------------+----------------------+
| op |                           name |                       avg_cost |                           time |            unix_time |
+----+--------------------------------+--------------------------------+--------------------------------+----------------------+
| +I |                           Alex |                          100.0 |  2022-01-01 08:00:00.001 +0800 |           1640995200 |
| +I |                           Emma |                          400.0 |  2022-01-01 00:00:00.003 +0000 |           1640995200 |
| +I |                           Alex |                          200.0 |  2022-01-01 08:00:00.005 +0800 |           1640995200 |
| +I |                           Emma |                          300.0 |  2022-01-01 00:00:00.007 +0000 |           1640995200 |
| +I |                           Jack |                          500.0 |  2022-01-01 08:00:00.009 +0800 |           1640995200 |
| +I |                           Alex |                          450.0 |  2022-01-01 00:00:00.011 +0000 |           1640995200 |
+----+--------------------------------+--------------------------------+--------------------------------+----------------------+
{code}
The expected result is
{code}
+----+--------------------------------+--------------------------------+--------------------------------+----------------------+
| op |                           name |                       avg_cost |                           time |            unix_time |
+----+--------------------------------+--------------------------------+--------------------------------+----------------------+
| +I |                           Alex |                          100.0 |  2022-01-01 08:00:00.001 +0800 |           1640995200 |
| +I |                           Emma |                          400.0 |  2022-01-01 00:00:00.003 +0000 |           1640966400 |
| +I |                           Alex |                          200.0 |  2022-01-01 08:00:00.005 +0800 |           1640995200 |
| +I |                           Emma |                          300.0 |  2022-01-01 00:00:00.007 +0000 |           1640966400 |
| +I |                           Jack |                          500.0 |  2022-01-01 08:00:00.009 +0800 |           1640995200 |
| +I |                           Alex |                          450.0 |  2022-01-01 00:00:00.011 +0000 |           1640966400 |
+----+--------------------------------+--------------------------------+--------------------------------+----------------------+
{code}
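As a cross-check that does not involve Flink, the two values in the tables above can be reproduced with Python's standard library. This is only a sketch: the helper names {{epoch_seconds}} and {{epoch_seconds_ignoring_offset}} and the fixed UTC+8 zone are assumptions made for illustration, not part of Flink or of the original program. The first helper honors the offset in the string; the second discards it and reads the wall-clock time as UTC+8.

```python
from datetime import datetime, timedelta, timezone

# Python equivalent of the pattern "yyyy-MM-dd HH:mm:ss.SSS X" used in the report.
FORMAT = "%Y-%m-%d %H:%M:%S.%f %z"


def epoch_seconds(text: str) -> int:
    """Epoch seconds when the zone offset in the string is honored."""
    return int(datetime.strptime(text, FORMAT).timestamp())


def epoch_seconds_ignoring_offset(text: str, assumed_offset_hours: int = 8) -> int:
    """Epoch seconds when the offset in the string is dropped and the
    wall-clock time is read in an assumed fixed zone (UTC+8 by default)."""
    naive = datetime.strptime(text, FORMAT).replace(tzinfo=None)
    assumed_zone = timezone(timedelta(hours=assumed_offset_hours))
    return int(naive.replace(tzinfo=assumed_zone).timestamp())


if __name__ == "__main__":
    for value in ("2022-01-01 08:00:00.001 +0800", "2022-01-01 00:00:00.003 +0000"):
        print(value, epoch_seconds(value), epoch_seconds_ignoring_offset(value))
```

With this arithmetic, both input strings give 1640995200 when the offset is honored, and the {{+0000}} string gives 1640966400 only when its offset is dropped and 00:00:00 is read as UTC+8 wall-clock time. That may help when deciding which of the two behaviors UNIX_TIMESTAMP is meant to have.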