dlmarion commented on PR #3262:
URL: https://github.com/apache/accumulo/pull/3262#issuecomment-1829805334

   > When the set of managers and tables are steady for a bit, all manager processes need to arrive at the same decisions for partitioning tables into buckets. With the algorithm in this method, different manager processes may see different counts for the same tables at different times and end up partitioning tables into different buckets. This could lead to overlap in the partitions or, in the worst case, a table that no manager processes. We could start with a deterministic hash partitioning of tables and open a follow-on issue to improve. One possible way to improve would be to have a single manager process run this algorithm and publish the partitioning information, with all other managers just using it.
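
   The deterministic hash partitioning mentioned above could be as simple as mapping each table id into a fixed number of buckets, so that every manager process computes the identical assignment from the same inputs. A minimal sketch (class and method names here are illustrative, not actual Accumulo APIs):

   ```java
   import java.util.List;
   import java.util.Map;
   import java.util.stream.Collectors;

   public class TablePartitioner {

     // Deterministically map a table id to one of numManagers buckets.
     // Every process computing this over the same inputs gets the same answer.
     static int bucketFor(String tableId, int numManagers) {
       // floorMod avoids negative buckets when hashCode() is negative
       return Math.floorMod(tableId.hashCode(), numManagers);
     }

     // Group all known table ids by the bucket (manager) that owns them.
     static Map<Integer,List<String>> partition(List<String> tableIds, int numManagers) {
       return tableIds.stream()
           .collect(Collectors.groupingBy(id -> bucketFor(id, numManagers)));
     }
   }
   ```

   Because the assignment depends only on the table id and the manager count, it needs no coordination, but it would still need a shared, stable view of the manager count to stay consistent.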
   
   > This would be a follow-on issue. Thinking we could distribute the compaction coordinator by having it hash partition queue names among manager processes. TGW could make an RPC to add a job to a remote queue. Compaction coordinators could hash the name to find the manager process to ask for work.
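
   Routing by queue name would use the same idea: hash the queue name over the current list of manager addresses to find the owner. A sketch under that assumption (the address list and names are illustrative):

   ```java
   import java.util.List;

   public class QueueRouter {

     // Route a compaction queue name to the manager process that owns it,
     // by hashing the queue name over the current list of manager addresses.
     static String managerFor(String queueName, List<String> managers) {
       return managers.get(Math.floorMod(queueName.hashCode(), managers.size()));
     }
   }
   ```

   Both the TGW adding a job and a compaction coordinator asking for work would compute the same owner from the same queue name.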
   
   > We may need to make the EventCoordinator use the same partitioning as the TGW and send events to other manager processes via a new async RPC. Need to analyze the EventCoordinator; it may make sense to pull it into the TGW conceptually. Every manager uses its local TGW instance to signal events, and internally the TGW code knows how to route that in the cluster to other TGW instances.
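
   The local-signal-with-internal-routing idea could look roughly like the following: the caller always signals its local instance, which either handles the event or forwards it to the owning manager using the same deterministic partitioning. All names are illustrative, not Accumulo APIs:

   ```java
   import java.util.function.BiConsumer;

   // Sketch: a local facade that either handles an event itself or forwards
   // it to the owning manager, using the same hash partitioning as the TGW.
   public class EventRouter {
     final int myBucket;
     final int numManagers;
     final Runnable localHandler;                 // process the event locally
     final BiConsumer<Integer,String> remoteRpc;  // async RPC to another manager

     EventRouter(int myBucket, int numManagers, Runnable localHandler,
         BiConsumer<Integer,String> remoteRpc) {
       this.myBucket = myBucket;
       this.numManagers = numManagers;
       this.localHandler = localHandler;
       this.remoteRpc = remoteRpc;
     }

     void signal(String tableId) {
       int owner = Math.floorMod(tableId.hashCode(), numManagers);
       if (owner == myBucket) {
         localHandler.run();
       } else {
         remoteRpc.accept(owner, tableId);
       }
     }
   }
   ```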
   
   I'm now concerned that this is going to be overly complex - lots of moving parts, with the potential for multiple managers to claim ownership of the same object, or reliance on some external process (ZK) to coordinate which Manager is responsible for a specific object. The Multiple Manager implementation in this PR is based off [this](https://cwiki.apache.org/confluence/display/ACCUMULO/Elasticity+Design+Notes+-+March+2023) design, which has multiple managers try to manage everything.
   
   I think there may be a simpler way, as we have already introduced a natural partitioning mechanism - resource groups. I went back and looked in the wiki and you (@keith-turner) had a very similar idea at the bottom of [this](https://cwiki.apache.org/confluence/display/ACCUMULO/Implementing+multiple+managers+via+independant+distributed+services) page. So, instead of having a single set of Managers try to manage everything, you would have a single Manager manage tablets, compactions, and Fate for all of the tables that map to a specific resource group. We could continue to have the active/backup Manager feature that we have today, but per resource group. This also solves the Monitor problem. If we look at this using the `cluster.yaml` file, it would go from what we have today:
   
   ```yaml
   manager:
     - localhost
   
   monitor:
     - localhost
   
   gc:
     - localhost
   
   tserver:
     default:
       - localhost
     group1:
       - localhost
   
   compactor:
     accumulo_meta:
       - localhost
     user_small:
       - localhost
     user_large:
       - localhost
   
   sserver:
     default:
       - localhost
     group1:
       - localhost    
   ```
   
   to something like:
   
   ```yaml
   default:
     manager:
       - localhost
     monitor:
       - localhost
     gc:
       - localhost
     tserver:
       - localhost
     compactor:
       accumulo_meta:
         - localhost
       user_small:
         - localhost
       user_large:
         - localhost
     sserver:
       default:
         - localhost
         
   group1:
     manager:
       - localhost
     monitor:
       - localhost
     gc:
       - localhost
     tserver:
       - localhost
     compactor:
       accumulo_meta:
         - localhost
       user_small:
         - localhost
       user_large:
         - localhost
     sserver:
       default:
         - localhost
   ```
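
   With that layout, a manager started in a given resource group would only claim the tables mapped to its group, so ownership never overlaps. A minimal sketch of that ownership check, assuming a table-to-group mapping is available somewhere (all names here are illustrative):

   ```java
   import java.util.Map;

   public class ResourceGroupFilter {

     // A manager running in myGroup only manages tables assigned to that
     // group; unmapped tables fall back to the default group.
     static boolean managedByThisManager(String myGroup,
         Map<String,String> tableToGroup, String tableId) {
       return myGroup.equals(tableToGroup.getOrDefault(tableId, "default"));
     }
   }
   ```

   Since a table belongs to exactly one resource group, each table has exactly one responsible Manager, with no hashing or external coordination needed beyond the existing per-group active/backup election.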

