Hi, Community

Recently, the community plans to reconstruct the API of DS in
https://github.com/apache/dolphinscheduler/pull/15321/. The `listXXX` API
generally has two methods for pagination: Offset-based pagination or
Cursor-based (token-based) pagination. Currently, DS uses the offset-based
pagination (e.g., pageSize and pageNo). When designing the v2 API, I think
we need to discuss what paging method to use.

### Offset-based Pagination
Users can see page numbers directly on the web UI and jump between
different pages. But there are 2 problems with this approach:
* Data conflicts and loss: If a data item is inserted or deleted during
paging, users may see duplicate data items on the next page, or data item
may be lost.
* Performance issue: When the offset is large, query efficiency will become
lower

### Cursor-based Pagination
Instead of relying on numeric offsets, cursor-based pagination uses a
unique identifier or token to mark the position in the dataset. Token-based
pagination does not have data conflict/loss or performance issue. And cloud
vendors (e.g., AWS / Azure / GCP) also use cursor-based pagination
extensively.

### An example of Cursor-based Pagination
* Assume table `records` is sorted using `id` and `update_time`
* Assume that a page returns 10 data items, then take the `id` and
`update_time` of the last data item as the cursor (usually encoded instead
of returned to the user in plain text).
* The user uses the cursor obtained from the previous query to initiate a
second paging query, and DS decodes it to obtain the `last_data.id` and
`last_data.update_time` of the last piece of data in the previous query.
```
SELECT *
FROM
    records
WHERE
    user_id = xxx
    AND (
        update_time < last_data.update_time
        OR ( update_time = last_data.update_time AND id < last_data.id )
    )
ORDER BY
    update_time DESC, id DESC
LIMIT 10;
```

I prefer to use cursor-based pagination in the v2 API of DS since it does
not have data conflict/loss or performance issue. And I don't think users
have a strong need to jump directly to a specific page. Any comments or
suggestions are welcome.

Ref:
[1] [Offset vs Cursor-Based Pagination: Which is the Right Choice for Your
Project?](
https://medium.com/@oshiryaeva/offset-vs-cursor-based-pagination-which-is-the-right-choice-for-your-project-e46f65db062f
)
[2] [Offset vs Cursor-Based Pagination: Choosing the Best Approach](
https://medium.com/@maryam-bit/offset-vs-cursor-based-pagination-choosing-the-best-approach-2e93702a118b
)


Best Regards,
Rick Cheng

Reply via email to