abhishekmjain opened a new pull request, #4179:
URL: https://github.com/apache/gobblin/pull/4179

   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title.
       - N/A - internal improvement to avoid bulk DB updates
   
   ### Description
   - [x] Here are some details about my PR:
   
   When `gobblin.service.flowConcurrencyAllowed` is `false` at the service level and a flow does not explicitly set `flow.allowConcurrentExecution` in its config, the only way to enable concurrency for a subset of flows is to update each flow individually in the DB. With 80k+ flows, such bulk updates would overwhelm the Brooklin CDC stream.
   
   This PR adds a new service-level config, `gobblin.service.flowConcurrencyAllowed.flowGroupPrefixes`, which takes a comma-separated list of flow group prefixes. Flows whose group matches any prefix default to allowing concurrent execution unless `flow.allowConcurrentExecution` is explicitly set.
   
   **Concurrency resolution order:**
   1. Per-flow `flow.allowConcurrentExecution` if explicitly set → use it
   2. Service-level `flowConcurrencyAllowed` if `true` → allow
   3. Flow group prefix match via `flowGroupPrefixes` → allow if matched, otherwise disallow
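   
   The resolution order above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: the class `ConcurrencyResolutionSketch`, the method `resolve`, and its parameters are hypothetical, and it assumes that "matches any prefix" means `String.startsWith` on the flow group name.
   
   ```java
   import java.util.List;
   import java.util.Optional;

   public class ConcurrencyResolutionSketch {

     /**
      * Hypothetical stand-in for resolveAllowConcurrentExecution().
      * The real method presumably reads these values from config objects.
      */
     public static boolean resolve(Optional<Boolean> perFlowSetting,
                                   boolean serviceLevelAllowed,
                                   List<String> flowGroupPrefixes,
                                   String flowGroup) {
       // 1. An explicit per-flow setting always wins, whether true or false.
       if (perFlowSetting.isPresent()) {
         return perFlowSetting.get();
       }
       // 2. A service-level value of true allows concurrency without a prefix check.
       if (serviceLevelAllowed) {
         return true;
       }
       // 3. Otherwise allow only if the flow group starts with a configured prefix
       //    (startsWith is an assumption about how matching is done).
       return flowGroupPrefixes.stream().anyMatch(flowGroup::startsWith);
     }

     public static void main(String[] args) {
       List<String> prefixes = List.of("teamA", "teamB-prod");
       // No per-flow setting, service-level false, group matches "teamA".
       System.out.println(resolve(Optional.empty(), false, prefixes, "teamA-ingest"));    // prints true
       // Explicit per-flow false overrides the prefix match.
       System.out.println(resolve(Optional.of(false), false, prefixes, "teamA-ingest")); // prints false
       // No per-flow setting, service-level false, no prefix match.
       System.out.println(resolve(Optional.empty(), false, prefixes, "teamC"));           // prints false
     }
   }
   ```
   
   Checking the per-flow value first keeps existing explicit settings authoritative, so adding a prefix never changes the behavior of a flow that has already opted in or out.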
   
   **Example config:**
   ```
   gobblin.service.flowConcurrencyAllowed.flowGroupPrefixes=teamA,teamB-prod
   ```
   
   **Files changed:**
   - `ServiceConfigKeys.java`: new `FLOW_CONCURRENCY_ALLOWED_FLOWGROUP_PREFIXES` config key
   - `FlowCompilationValidationHelper.java`: extracted `resolveAllowConcurrentExecution()` with 3-tier resolution; parses prefixes with trim/blank filtering
   - `FlowCompilationValidationHelperTest.java`: 8 new tests covering all resolution paths and edge cases (blank/double-comma prefixes)
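   
   The trim/blank filtering matters because an empty prefix would match every flow group (`"anyGroup".startsWith("")` is `true` in Java). A minimal sketch of such parsing is shown here; the class `PrefixParsingSketch` and the method `parsePrefixes` are hypothetical names, and the actual helper may parse the value differently.
   
   ```java
   import java.util.Arrays;
   import java.util.List;
   import java.util.stream.Collectors;

   public class PrefixParsingSketch {

     /** Splits a comma-separated value, trimming entries and dropping blank ones. */
     public static List<String> parsePrefixes(String raw) {
       if (raw == null) {
         return List.of();
       }
       return Arrays.stream(raw.split(","))
           .map(String::trim)
           // Drop empty entries so that "teamA,,teamB" never yields a match-all prefix.
           .filter(s -> !s.isEmpty())
           .collect(Collectors.toList());
     }

     public static void main(String[] args) {
       // Double comma, surrounding whitespace, and a trailing blank entry.
       System.out.println(parsePrefixes("teamA,, teamB-prod , ")); // prints [teamA, teamB-prod]
       // Why blanks must be dropped: the empty string is a prefix of every string.
       System.out.println("anyGroup".startsWith(""));              // prints true
     }
   }
   ```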
   
   ### Tests
   - [x] My PR adds the following unit tests:
       - `testResolveAllowConcurrentExecution_explicitFlowConfigTrue`: per-flow true wins over service-level false
       - `testResolveAllowConcurrentExecution_explicitFlowConfigFalse`: per-flow false wins over service-level true
       - `testResolveAllowConcurrentExecution_noFlowConfig_serviceLevelTrue`: service-level true allows without prefix check
       - `testResolveAllowConcurrentExecution_noFlowConfig_serviceLevelFalse_prefixMatch`: prefix matching works
       - `testResolveAllowConcurrentExecution_noFlowConfig_serviceLevelFalse_noPrefixMatch`: non-matching prefix disallows
       - `testResolveAllowConcurrentExecution_noFlowConfig_serviceLevelFalse_noPrefixesConfigured`: no prefixes = disallow
       - `testResolveAllowConcurrentExecution_noFlowConfig_serviceLevelFalse_blankPrefixesIgnored`: empty/blank entries don't match all groups
       - `testResolveAllowConcurrentExecution_explicitFlowConfigFalse_overridesPrefixMatch`: per-flow false overrides prefix match
   
   ### Commits
   - [x] My commits follow the guidelines from "How to write a good git commit 
message"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
