[PR] [Subtask]: In master-slave mode, database-based bucket allocation is supported. [amoro]

via GitHub Mon, 16 Mar 2026 00:08:02 -0700


wardlican opened a new pull request, #4123:
URL: https://github.com/apache/amoro/pull/4123


   
   ## Why are the changes needed?
   In master-slave mode, database-based bucket allocation is now supported. 
The allocation information and node registration are implemented directly 
through the database.
   
   Close #4121
   
   ## Brief change log
   
   
   
   **Summary of changes: `DBBucketAssignStore` for database HA 
(`HA_TYPE_DATABASE`)**
   
   *   **1. Schema Initialization (Aligned with ZK Semantics)**
       *   Added the `bucket_assignments` table to the initialization scripts 
for Derby, MySQL, and PostgreSQL (`ams-[db]-init.sql`).
       *   Defined fields including `cluster_name`, `node_key` (Primary Key, 
formatted as `host:thriftBindPort`), `server_info_json`, `assignments_json`, 
and `last_update_time` (in milliseconds).
   
   *   **2. Persistence Layer Enhancements**
       *   Introduced `BucketAssignmentMeta` entity class to map the table 
records.
       *   Created `BucketAssignMapper` using MyBatis annotations to handle 
CRUD operations (`insert`, `update`, `updateLastUpdateTime`, `selectByNode`, 
`selectAllByCluster`, `deleteByNode`).
       *   Registered the new mapper in the `SqlSessionFactoryProvider`.
   
   *   **3. Core Implementation (`DBBucketAssignStore`)**
       *   Implemented the `BucketAssignStore` interface by extending 
`PersistentBase`.
       *   **`saveAssignments`**: Implements upsert logic (attempts an 
`update` and falls back to `insert` if no rows are affected) to store the 
assignments, server info, and last update time.
       *   **`getAllAssignments` / `getAssignments`**: Query the rows and 
deserialize the JSON data into `Map<AmsServerInfo, List<String>>` and 
`List<String>`. If `server_info_json` cannot be parsed, `host:port` is parsed 
from `node_key` as a fallback.
       *   **`removeAssignments`**: Deletes the specific row based on 
`cluster_name` and `node_key`.
       *   **Update Time Management**: Added logic for `getLastUpdateTime` and 
`updateLastUpdateTime`, including inserting a placeholder record if the row 
doesn't exist during updates.
       *   **Exception Handling**: All underlying database exceptions are 
caught and wrapped in a custom `BucketAssignStoreException`.
   
   *   **4. Factory Integration & Auto-DDL**
       *   Updated `BucketAssignStoreFactory` to correctly instantiate 
`DBBucketAssignStore(clusterName)` when `ha.type=database`, removing the 
previous `UnsupportedOperationException`.
       *   Relies on the existing `DB_AUTO_CREATE_TABLES=true` logic to 
automatically execute the initialization SQL scripts for table creation.
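
The update-then-insert behaviour described for `saveAssignments` can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `InMemoryMapper` is a hypothetical stand-in for `BucketAssignMapper` (a map plays the role of the `bucket_assignments` table), and the method signatures and return values are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class UpsertSketch {

    /** Hypothetical stand-in for BucketAssignMapper: a map plays the role of the table. */
    static class InMemoryMapper {
        // Keyed by "clusterName/nodeKey" purely for this sketch.
        final Map<String, String> rows = new HashMap<>();

        /** Mimics a SQL UPDATE: returns the number of rows affected (0 or 1). */
        int update(String clusterName, String nodeKey, String assignmentsJson) {
            String key = clusterName + "/" + nodeKey;
            if (!rows.containsKey(key)) {
                return 0;
            }
            rows.put(key, assignmentsJson);
            return 1;
        }

        /** Mimics a SQL INSERT of a new row. */
        void insert(String clusterName, String nodeKey, String assignmentsJson) {
            rows.put(clusterName + "/" + nodeKey, assignmentsJson);
        }
    }

    /** Tries UPDATE first and falls back to INSERT when no row was affected. */
    static String saveAssignments(
            InMemoryMapper mapper, String clusterName, String nodeKey, String assignmentsJson) {
        if (mapper.update(clusterName, nodeKey, assignmentsJson) > 0) {
            return "updated";
        }
        mapper.insert(clusterName, nodeKey, assignmentsJson);
        return "inserted";
    }

    public static void main(String[] args) {
        InMemoryMapper mapper = new InMemoryMapper();
        // First call finds no row and inserts; second call updates the existing row.
        System.out.println(saveAssignments(mapper, "default", "127.0.0.1:1260", "[\"1\",\"2\"]"));
        System.out.println(saveAssignments(mapper, "default", "127.0.0.1:1260", "[\"3\"]"));
    }
}
```

Against a real database the two statements are not atomic. If two writers ever raced on the same `node_key`, both could see zero updated rows, and the primary key would then reject the second insert, so the caller would need to handle or retry that case.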
   
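The `node_key` fallback mentioned for `getAllAssignments` might look something like the sketch below. It is only an illustration under stated assumptions: `ServerInfo` is a hypothetical, reduced stand-in for `AmsServerInfo`, the JSON parser is passed in as a plain function rather than tied to a specific JSON library, and splitting on the last `:` is a design choice of this example, not something the PR states.

```java
import java.util.function.Function;

public class NodeKeyFallbackSketch {

    /** Hypothetical, minimal stand-in for AmsServerInfo. */
    static final class ServerInfo {
        final String host;
        final int thriftBindPort;

        ServerInfo(String host, int thriftBindPort) {
            this.host = host;
            this.thriftBindPort = thriftBindPort;
        }
    }

    /** Parses "host:thriftBindPort", splitting on the last ':' in the key. */
    static ServerInfo fromNodeKey(String nodeKey) {
        int idx = nodeKey.lastIndexOf(':');
        if (idx <= 0 || idx == nodeKey.length() - 1) {
            throw new IllegalArgumentException("Malformed node_key: " + nodeKey);
        }
        String host = nodeKey.substring(0, idx);
        int port = Integer.parseInt(nodeKey.substring(idx + 1));
        return new ServerInfo(host, port);
    }

    /** Prefers the parsed JSON; falls back to node_key if parsing throws or yields null. */
    static ServerInfo resolve(
            String serverInfoJson, String nodeKey, Function<String, ServerInfo> jsonParser) {
        try {
            ServerInfo parsed = jsonParser.apply(serverInfoJson);
            if (parsed != null) {
                return parsed;
            }
        } catch (RuntimeException e) {
            // JSON is unusable; fall through to the node_key fallback.
        }
        return fromNodeKey(nodeKey);
    }

    public static void main(String[] args) {
        // A parser that always fails, to exercise the fallback path.
        Function<String, ServerInfo> failing = json -> {
            throw new IllegalStateException("unparseable JSON");
        };
        ServerInfo info = resolve("{broken", "ams-1:1260", failing);
        System.out.println(info.host + " " + info.thriftBindPort);
    }
}
```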
   
   ## How was this patch tested?
   
   - [ ] Add some test cases that check the changes thoroughly including 
negative and positive cases if possible
   
   - [ ] Add screenshots for manual tests if appropriate
   
   - [ ] Run test locally before making a pull request
   
   ## Documentation
   
   - Does this pull request introduce a new feature? (yes / no)
   - If yes, how is the feature documented? (not applicable / docs / JavaDocs / 
not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

