aglinxinyuan opened a new issue, #5708: URL: https://github.com/apache/texera/issues/5708
### Feature Summary

The Python worker maps worker actor names of the form `Worker:WF<workflowId>-<operatorId>-<layerName>-<workerIndex>` back to identifiers with a regex in `amber/src/main/python/core/util/virtual_identity.py`. There is no single helper to extract the **logical operator id** from a worker name. In addition, the existing `get_worker_index` uses `re.match`, which anchors only at the start, so a malformed worker id with trailing junk parses silently. This contradicts the docstring's claim that it mirrors the Scala `VirtualIdentityUtils.getPhysicalOpId` parse, which requires a full match.

### Proposed Solution or Design

- Add `get_operator_id(worker_id) -> str`, which returns the logical operator id and raises `ValueError` on a malformed id.
- Generalize `worker_name_pattern` to capture the workflow id and operator id explicitly.
- Switch both `get_worker_index` and `get_operator_id` to `re.fullmatch`, so a malformed id fails loudly instead of parsing silently.

The change is behavior-preserving for well-formed worker ids (`get_worker_index` still returns the final capture group). `get_operator_id` gains its production caller with the for-loop feature.

### Affected Area

- Workflow Engine (Amber)
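
For illustration, here is a minimal sketch of how the proposed helpers might look. It is not the actual contents of `virtual_identity.py`. The exact regex is an assumption based on the name format described in this issue (a numeric workflow id, an operator id that may itself contain hyphens, a word-character layer name, and a numeric worker index). The `_parse` helper is hypothetical, and returning the worker index as an `int` is also an assumption.

```python
import re

# Assumed pattern for "Worker:WF<workflowId>-<operatorId>-<layerName>-<workerIndex>".
# Groups: 1 = workflow id, 2 = operator id, 3 = layer name, 4 = worker index.
worker_name_pattern = re.compile(r"Worker:WF(\d+)-(.+)-(\w+)-(\d+)")


def _parse(worker_id: str) -> re.Match:
    """Hypothetical shared helper: require the whole string to match."""
    match = worker_name_pattern.fullmatch(worker_id)
    if match is None:
        raise ValueError(f"Invalid worker ID format: {worker_id}")
    return match


def get_worker_index(worker_id: str) -> int:
    # Still reads the final capture group, as before.
    return int(_parse(worker_id).group(4))


def get_operator_id(worker_id: str) -> str:
    # The logical operator id is the second capture group.
    return _parse(worker_id).group(2)
```

With `re.match`, an id such as `Worker:WF1-op-main-0junk` would match its leading `Worker:WF1-op-main-0` and the trailing `junk` would be ignored. With `re.fullmatch`, the same input matches nothing, so both helpers raise `ValueError`.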
