abhishekmjain opened a new pull request, #4179: URL: https://github.com/apache/gobblin/pull/4179
Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title.
    - N/A - internal improvement to avoid bulk DB updates

### Description
- [x] Here are some details about my PR:

When `gobblin.service.flowConcurrencyAllowed` is `false` at service level and a flow does not have `flow.allowConcurrentExecution` explicitly set in its config, there is no way to selectively enable concurrency for a subset of flows without updating each flow individually in the DB. With 80k+ flows, bulk updates would overwhelm the Brooklin CDC stream.

This PR adds a new service-level config, `gobblin.service.flowConcurrencyAllowed.flowGroupPrefixes`: a comma-separated list of flow group prefixes. Flows whose group matches any prefix will default to allowing concurrent execution.

**Concurrency resolution order:**
1. Per-flow `flow.allowConcurrentExecution` if explicitly set → use it
2. Service-level `flowConcurrencyAllowed` if `true` → allow
3. Flow group prefix match via `flowGroupPrefixes` → allow if matched

**Example config:**
```
gobblin.service.flowConcurrencyAllowed.flowGroupPrefixes=teamA,teamB-prod
```

**Files changed:**
- `ServiceConfigKeys.java`: new `FLOW_CONCURRENCY_ALLOWED_FLOWGROUP_PREFIXES` config key
- `FlowCompilationValidationHelper.java`: extracted `resolveAllowConcurrentExecution()` with 3-tier resolution; parses prefixes with trim/blank filtering
- `FlowCompilationValidationHelperTest.java`: 8 new tests covering all resolution paths and edge cases (blank/double-comma prefixes)

### Tests
- [x] My PR adds the following unit tests:
    - `testResolveAllowConcurrentExecution_explicitFlowConfigTrue`: per-flow true wins over service-level false
    - `testResolveAllowConcurrentExecution_explicitFlowConfigFalse`: per-flow false wins over service-level true
    - `testResolveAllowConcurrentExecution_noFlowConfig_serviceLevelTrue`: service-level true allows without prefix check
    - `testResolveAllowConcurrentExecution_noFlowConfig_serviceLevelFalse_prefixMatch`: prefix matching works
    - `testResolveAllowConcurrentExecution_noFlowConfig_serviceLevelFalse_noPrefixMatch`: non-matching prefix disallows
    - `testResolveAllowConcurrentExecution_noFlowConfig_serviceLevelFalse_noPrefixesConfigured`: no prefixes = disallow
    - `testResolveAllowConcurrentExecution_noFlowConfig_serviceLevelFalse_blankPrefixesIgnored`: empty/blank entries don't match all groups
    - `testResolveAllowConcurrentExecution_explicitFlowConfigFalse_overridesPrefixMatch`: per-flow false overrides prefix match

### Commits
- [x] My commits follow the guidelines from "How to write a good git commit message"
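
For readers who want to see the three-tier concurrency resolution order from the Description in code, here is a minimal, self-contained sketch. It is not the code in this PR: the class name `ConcurrencyResolutionSketch`, the method signatures, and the use of `Optional<Boolean>` for the per-flow setting are all hypothetical, and it assumes that "prefix match" means a plain `String.startsWith` check on the flow group.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the FlowCompilationValidationHelper code from this PR.
public class ConcurrencyResolutionSketch {

  // Splits a comma-separated prefix list, trimming entries and dropping blank ones.
  // Dropping blanks matters because every string starts with "", so a blank
  // entry would otherwise match every flow group.
  static List<String> parsePrefixes(String raw) {
    if (raw == null) {
      return Collections.emptyList();
    }
    return Arrays.stream(raw.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toList());
  }

  // Resolves whether concurrent execution is allowed, in the order the PR describes.
  static boolean resolveAllowConcurrentExecution(Optional<Boolean> perFlowSetting,
      boolean serviceLevelAllowed, List<String> prefixes, String flowGroup) {
    // 1. An explicit per-flow setting always wins, whether true or false.
    if (perFlowSetting.isPresent()) {
      return perFlowSetting.get();
    }
    // 2. Service-level true allows concurrency without consulting prefixes.
    if (serviceLevelAllowed) {
      return true;
    }
    // 3. Otherwise allow only if the flow group starts with a configured prefix
    //    (assumption: prefix matching is a startsWith check).
    return prefixes.stream().anyMatch(flowGroup::startsWith);
  }

  public static void main(String[] args) {
    List<String> prefixes = parsePrefixes("teamA, ,,teamB-prod");
    System.out.println(prefixes);
    System.out.println(
        resolveAllowConcurrentExecution(Optional.empty(), false, prefixes, "teamA-ingest"));
    System.out.println(
        resolveAllowConcurrentExecution(Optional.of(false), false, prefixes, "teamA-ingest"));
  }
}
```

Under the `startsWith` assumption, a prefix such as `teamA` would also match a group named `teamAnalytics`, so an operator relying on this kind of matching may want to pick prefixes that end in a delimiter.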
