Hi community,

I have a job that runs a Top-N query as described in
https://ci.apache.org/projects/flink/flink-docs-release-1.13/zh/docs/dev/table/sql/queries/topn/.
You can find the job in
https://github.com/YikSanChan/pyflink-quickstart/blob/26e7c09aaa1167b13b981586ffb0e7d6bb6cf053/topk.py

What I find interesting:
- In local mode, i.e. `python topk.py`, the output differs from run to
run.
- In standalone mode, i.e. `~/softwares/flink-1.12.0/bin/flink run -d
-pyexec /usr/local/anaconda3/envs/pyflink-quickstart/bin/python -py
topk.py`, I always get consistent output.

By consistent output, I mean the rows always arrive in exactly this order:

```
+I(1,100)
-U(1,100)
+U(1,100,101)
+I(2,201)
-U(2,201)
+U(2,201,200)
-U(2,201,200)
+U(2,201,200,202)
-U(2,201,200,202)
+U(2,201,202)
-U(2,201,202)
+U(2,201,202,203)
-U(1,100,101)
+U(1,100,101,102)
-U(1,100,101,102)
+U(1,101,102)
-U(1,101,102)
+U(1,101,102,103)
-U(1,101,102,103)
+U(1,102,103)
-U(1,102,103)
+U(1,102,103,104)
-U(2,201,202,203)
+U(2,202,203)
-U(2,202,203)
+U(2,202,203,204)
```

In contrast, `python topk.py` gives a seemingly random (and unreasonable)
output every time I run it, for example:

```
6> +I(1,102)
6> +I(2,202)
6> -U(2,202)
6> +U(2,202,203)
6> -U(1,102)
6> +U(1,102,104)
6> -U(1,102,104)
6> +U(1,102,104,103)
6> -U(2,202,203)
6> +U(2,202,203,201)
6> -U(2,202,203,201)
6> +U(2,202,203)
6> -U(2,202,203)
6> +U(2,202,203,204)
```
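
To compare two runs, it may help to replay each changelog and look at the
final row per key. Below is a minimal sketch in plain Python (not part of
topk.py). It assumes every line has the form shown above, with an optional
`N> ` subtask prefix, and that the first field is the key. The names `ROW`
and `replay` are made up for this illustration.

```python
import re

# Matches lines such as "6> +U(1,102,104)" or "+I(1,100)".
# The "N> " subtask prefix is optional. Assumption: all fields are integers.
ROW = re.compile(r"^(?:\d+>\s*)?([+-][IUD])\((\d+),(.*)\)$")


def replay(lines):
    """Apply changelog rows in order and return the final row per key."""
    state = {}
    for line in lines:
        m = ROW.match(line.strip())
        if m is None:
            continue  # skip blank or unrelated lines
        kind, key, rest = m.group(1), int(m.group(2)), m.group(3)
        values = tuple(int(v) for v in rest.split(","))
        if kind in ("+I", "+U"):
            # Insert or update-after: store the new row for this key.
            state[key] = values
        else:
            # Update-before or delete: must retract exactly the stored row.
            if state.get(key) != values:
                raise ValueError(f"retraction does not match state: {line!r}")
            del state[key]
    return state


if __name__ == "__main__":
    sample = ["6> +I(1,102)", "6> -U(1,102)", "6> +U(1,102,104)"]
    print(replay(sample))
```

Replaying the two outputs above this way, every retraction matches the row it
retracts in both runs. The difference is in the values: key 1 ends at
(102,103,104) in standalone mode and at (102,104,103) in local mode, which
suggests the input rows reach the query in a different order.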

Note that I use the default flink-conf.yaml, so the number of task slots is
always 1. Any idea why the two modes behave differently?

Best,
Yik San
