Adding another API that uses cursor-based pagination is a good choice. good job!
Best Regards --------------- Apache DolphinScheduler PMC Chair & Apache SeaTunnel PMC member David Linkedin: https://www.linkedin.com/in/davidzollo Twitter: @WorkflowEasy <https://twitter.com/WorkflowEasy> --------------- On Tue, Mar 12, 2024 at 11:53 AM Xin Cheng <xinch...@apache.org> wrote: > Thanks a lot for your reply. I agree with your point about performance > issues. The amount of data in DS is indeed not very large, except for > querying the task instance list. > > Also, cursor-based pagination does not have data conflict/loss issues. If > the community thinks that this problem has little impact, I think v2-api > can continue to maintain the original paging method. And for APIs that > return a larger amount of data (e.g., task instance list), we can add > another API that uses cursor-based pagination. > > Best Regards, > Rick Cheng > > On 2024/03/08 03:30:45 David Zollo wrote: > > I think offset-based pagination is more convenient for users, usually > > users don't have too many pages and it's convenient for users to use open > > api. > > > > Cursor-Based Pagination is proper for much more data. but it's not a good > > choice for scheduling > > > > > > > > Best Regards > > > > --------------- > > Apache DolphinScheduler PMC Chair & Apache SeaTunnel PMC member > > David > > Linkedin: https://www.linkedin.com/in/davidzollo > > Twitter: @WorkflowEasy <https://twitter.com/WorkflowEasy> > > --------------- > > > > > > On Wed, Feb 21, 2024 at 10:49 AM Rick Cheng <rickche...@gmail.com> > wrote: > > > > > Hi, Community > > > > > > Recently, the community plans to reconstruct the API of DS in > > > https://github.com/apache/dolphinscheduler/pull/15321/. The `listXXX` > API > > > generally has two methods for pagination: Offset-based pagination or > > > Cursor-based (token-based) pagination. Currently, DS uses the > offset-based > > > pagination (e.g., pageSize and pageNo). When designing the v2 API, I > think > > > we need to discuss what paging method to use. > > > > > > ### Offset-based Pagination > > > Users can see page numbers directly on the web UI and jump between > > > different pages. But there are 2 problems with this approach: > > > * Data conflicts and loss: If a data item is inserted or deleted during > > > paging, users may see duplicate data items on the next page, or data > item > > > may be lost. > > > * Performance issue: When the offset is large, query efficiency will > become > > > lower > > > > > > ### Cursor-based Pagination > > > Instead of relying on numeric offsets, cursor-based pagination uses a > > > unique identifier or token to mark the position in the dataset. > Token-based > > > pagination does not have data conflict/loss or performance issue. And > cloud > > > vendors (e.g., AWS / Azure / GCP) also use cursor-based pagination > > > extensively. > > > > > > ### An example of Cursor-based Pagination > > > * Assume table `records` is sorted using `id` and `update_time` > > > * Assume that a page returns 10 data items, then take the `id` and > > > `update_time` of the last data item as the cursor (usually encoded > instead > > > of returned to the user in plain text). > > > * The user uses the cursor obtained from the previous query to > initiate a > > > second paging query, and DS decodes it to obtain the `last_data.id` > and > > > `last_data.update_time` of the last piece of data in the previous > query. > > > ``` > > > SELECT * > > > FROM > > > records > > > WHERE > > > user_id = xxx > > > AND ( > > > update_time < last_data.update_time > > > OR ( update_time = last_data.update_time AND id < last_data.id > ) > > > ) > > > ORDER BY > > > update_time DESC, id DESC > > > LIMIT 10; > > > ``` > > > > > > I prefer to use cursor-based pagination in the v2 API of DS since it > does > > > not have data conflict/loss or performance issue. And I don't think > users > > > have a strong need to jump directly to a specific page. Any comments or > > > suggestions are welcome. > > > > > > Ref: > > > [1] [Offset vs Cursor-Based Pagination: Which is the Right Choice for > Your > > > Project?]( > > > > > > > https://medium.com/@oshiryaeva/offset-vs-cursor-based-pagination-which-is-the-right-choice-for-your-project-e46f65db062f > > > ) > > > [2] [Offset vs Cursor-Based Pagination: Choosing the Best Approach]( > > > > > > > https://medium.com/@maryam-bit/offset-vs-cursor-based-pagination-choosing-the-best-approach-2e93702a118b > > > ) > > > > > > > > > Best Regards, > > > Rick Cheng > > > > > >