xielongfei opened a new issue, #29215:
URL: https://github.com/apache/shardingsphere/issues/29215

   1. **Brief Description**: I have a user table split into 32 subtables. When 
querying the user list, the returned total number of records does not match the 
actual data (related to my business, please refer to the detailed description 
below).
   
   2. **Versions**:
      - Shardingsphere-jdbc 5.2.0
      - MyBatis Plus 3.3.1
   
   3. **Table Structure**:
      ```sql
      CREATE TABLE user (
        id INT,
        name VARCHAR,   -- Name
        idCard VARCHAR, -- ID Card
        areaId INT,     -- Area ID (Sharding Key)
        ...
      )
      ```
      There is one user table, divided into 32 tables based on `areaId`, 
corresponding to user_0, user_1, ..., user_31. 99% of queries can be executed 
on a single table, based on the sharding key. 
      
      However, there is a business scenario where I need to perform a global 
deduplication based on names and ID cards. This means the same name and ID card 
can exist in both user_0 and user_1.
   
   4. **Query Execution**: When I query the list, MyBatis Plus generates two 
logical SQL queries:
      ```sql
      SELECT COUNT(1) FROM ( SELECT DISTINCT r.NAME, r.idCard FROM user r WHERE 
r.areaId IN (1, 2, 3, 4) ) TOTAL;
      SELECT DISTINCT r.NAME, r.idCard FROM user r WHERE r.areaId IN (1, 2, 3, 
4);
      ```
      Actual executed SQL (I modified the sharding condition values):
      ```sql
      -- Count Queries
       SELECT COUNT(1) FROM ( SELECT DISTINCT r.NAME, r.idCard FROM user_1 r 
WHERE r.areaId IN (1) ) TOTAL;
       SELECT COUNT(1) FROM ( SELECT DISTINCT r.NAME, r.idCard FROM user_2 r 
WHERE r.areaId IN (2) ) TOTAL;
       SELECT COUNT(1) FROM ( SELECT DISTINCT r.NAME, r.idCard FROM user_3 r 
WHERE r.areaId IN (3) ) TOTAL;
       SELECT COUNT(1) FROM ( SELECT DISTINCT r.NAME, r.idCard FROM user_4 r 
WHERE r.areaId IN (4) ) TOTAL;
   
      -- Select Queries
       SELECT DISTINCT r.NAME, r.idCard FROM user_1 r WHERE r.areaId IN (1);
       SELECT DISTINCT r.NAME, r.idCard FROM user_2 r WHERE r.areaId IN (2);
       SELECT DISTINCT r.NAME, r.idCard FROM user_3 r WHERE r.areaId IN (3);
       SELECT DISTINCT r.NAME, r.idCard FROM user_4 r WHERE r.areaId IN (4);
      ```
   
   
![企业微信截图_1701055042150](https://github.com/apache/shardingsphere/assets/19742146/4895be54-9ed4-4318-8f64-253f230651ec)
   The count of 599508 is incorrect. When clicking to navigate to page 29975, 
no data is returned.
   
   5. **Problem Caused**: 
      When counting the total, Shardingsphere simply adds them up, 
   but when querying data, Shardingsphere deduplicates them through a streaming 
merge concept, leading to data inconsistency.
   
   If my description is unclear, please point it out so I can correct it. Thank 
you.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: 
[email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to