GitHub user Yicong-Huang added a comment to the discussion: Support Batch Execution Mode
In general I like the idea of supporting multiple execution modes in one engine. Some things to consider:

1. I don't think it is a good idea to call them batch vs. streaming; those are different concepts. "Batch" is for bounded, complete data, while "streaming" is for unbounded, continuous data. What you describe is closer to `operator-at-a-time` vs. `tuple-at-a-time`. Maybe you can call them `Materialized` mode vs. `Pipelined` mode.

2. Please correct me if I am wrong, but the backend difference is the region size: the materialized mode is equivalent to creating one region per operator, while the pipelined mode just uses pasta to cut regions. If so, I actually think it might be better to define a region-size parameter with a sliding bar, where a user can choose from `1 operator/region`, `2 operators/region`, all the way up to auto operators/region (i.e., pasta). This gives more flexibility to configure and includes your two modes as special cases. It would also be easier for users to understand: in their language, it is a matter of how many operators can execute "at the same time".

GitHub link: https://github.com/apache/texera/discussions/4149#discussioncomment-15511774
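To make the sliding-bar idea concrete, here is a minimal sketch (in Python, with hypothetical names; this is not Texera's actual API) of how a single `region_size` parameter could subsume both modes, with the auto setting standing in for a cost-based cutter like pasta:

```python
def cut_regions(operators, region_size=None):
    """Group a linear chain of operators into execution regions.

    region_size=1    -> one region per operator (the "Materialized" mode)
    region_size=k    -> at most k operators per region (intermediate settings)
    region_size=None -> "auto": placeholder for a cost-based cutter like pasta,
                        approximated here by one region for the whole chain.
    """
    if region_size is None:
        # Hypothetical stand-in for the auto (pasta) strategy.
        return [list(operators)]
    return [list(operators[i:i + region_size])
            for i in range(0, len(operators), region_size)]

ops = ["scan", "filter", "join", "aggregate", "sink"]
print(cut_regions(ops, region_size=1))  # materialized: one operator per region
print(cut_regions(ops, region_size=2))  # intermediate: up to 2 operators per region
print(cut_regions(ops))                 # auto: everything in one pipelined region
```

Under this framing, the two proposed modes are just the two endpoints of the slider, and every setting in between is also a valid configuration.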
