shounakmk219 opened a new pull request, #17632:
URL: https://github.com/apache/pinot/pull/17632
## Summary
Adds **METADATA** and **URI** segment push modes to
`BaseSingleSegmentConversionExecutor` so single-segment conversion tasks can
copy the segment to the output PinotFS and send only its URI and metadata to
the controller, instead of always uploading the segment TAR over HTTP. Moves
shared segment-push logic into `BaseTaskExecutor` so both single- and
multiple-segment conversion executors reuse the same helpers.
## Changes
### BaseTaskExecutor
- Added shared segment push constants: `SEGMENT_PUSH_DEFAULT_ATTEMPTS`,
`SEGMENT_PUSH_DEFAULT_PARALLELISM`,
`SEGMENT_PUSH_DEFAULT_RETRY_INTERVAL_MILLIS`.
- Added protected helpers used by both conversion executors:
  - `getPushJobSpec(configs)` – builds `PushJobSpec` from task config.
  - `generateSegmentGenerationJobSpec(tableName, configs, pushJobSpec)` –
    builds `SegmentGenerationJobSpec` for controller push.
  - `moveSegmentToOutputPinotFS(configs, localSegmentTarFile)` – copies
    local segment tar to output PinotFS; requires `output.segment.dir.uri`.
  - `getSegmentPushCommonParams(tableNameWithType)` – common HTTP params
    (parallel push protection, table name, table type).
  - `getSegmentPushMetadataHeaders(pinotTaskConfig, authProvider,
    segmentConversionResult)` – headers for metadata push (ZK metadata
    modifier + auth).
  - `getSegmentPushType(configs)` – resolves push mode from config
    (`push.mode`, default TAR); subclasses can override.
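
As a rough illustration of the push-mode resolution described for `getSegmentPushType(configs)`, the sketch below reads `push.mode` from a config map and falls back to TAR. It is not the PR's code: `PushModeSketch`, `PushMode`, and `resolvePushMode` are hypothetical stand-ins, and the case-insensitive matching and the handling of blank values are assumptions.

```java
import java.util.Locale;
import java.util.Map;

class PushModeSketch {
  // Hypothetical stand-in for Pinot's push type enum; only the modes named in this PR.
  enum PushMode { TAR, URI, METADATA }

  static final String PUSH_MODE_KEY = "push.mode";

  // Returns TAR when the key is absent (the default stated in the PR).
  // Assumptions: blank values also fall back to TAR, and matching is case-insensitive.
  static PushMode resolvePushMode(Map<String, String> configs) {
    String raw = configs.get(PUSH_MODE_KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return PushMode.TAR;
    }
    // valueOf throws IllegalArgumentException for an unknown mode name.
    return PushMode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    System.out.println(resolvePushMode(Map.of()));
    System.out.println(resolvePushMode(Map.of(PUSH_MODE_KEY, "metadata")));
  }
}
```

A subclass that always wants one mode could skip the lookup entirely and return a fixed value from its override.
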
### BaseSingleSegmentConversionExecutor
- Supports configurable push mode (TAR vs METADATA/URI) via
`getSegmentPushType(configs)` (overridable).
- After conversion, upload step branches on push type: **TAR** → existing
HTTP upload; **METADATA/URI** → move segment to output PinotFS (when
`output.segment.dir.uri` and `push.controllerUri` are set) and call
`SegmentPushUtils.sendSegmentUriAndMetadata`.
- Uses base `getSegmentPushCommonParams` for upload parameters and the new
base helpers for the METADATA path (no duplicated push logic).
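
The branch on push type can be pictured with the minimal sketch below, which only lists the steps each mode would take and performs no I/O. `UploadBranchSketch`, `PushMode`, and `planUpload` are hypothetical names; the real executor calls the HTTP upload or `SegmentPushUtils.sendSegmentUriAndMetadata` rather than returning strings.

```java
import java.util.ArrayList;
import java.util.List;

class UploadBranchSketch {
  // Hypothetical stand-in for the push type enum used by the executors.
  enum PushMode { TAR, URI, METADATA }

  // Returns the ordered steps the upload stage would take for the given mode.
  static List<String> planUpload(PushMode mode) {
    List<String> steps = new ArrayList<>();
    switch (mode) {
      case TAR:
        // Existing behavior: the segment tar goes to the controller over HTTP.
        steps.add("upload segment tar to controller over HTTP");
        break;
      case URI:
      case METADATA:
        // New behavior: the segment lands on the output PinotFS first,
        // then the controller receives its URI and metadata.
        steps.add("move segment tar to output PinotFS");
        steps.add("send segment URI and metadata to controller");
        break;
      default:
        throw new IllegalArgumentException("Unsupported push mode: " + mode);
    }
    return steps;
  }

  public static void main(String[] args) {
    for (PushMode mode : PushMode.values()) {
      System.out.println(mode + " -> " + planUpload(mode));
    }
  }
}
```
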
### BaseMultipleSegmentsConversionExecutor
- Removed duplicate push logic: dropped local constants and implementations
of `getPushJobSpec`, `generateSegmentGenerationJobSpec`,
`moveSegmentToOutputPinotFS`, and `getSegmentPushCommonParams`; uses base
implementations.
- `getSegmentPushCommonHeaders` now delegates to base
`getSegmentPushMetadataHeaders`.
- `pushSegment` uses base `getSegmentPushType(taskConfigs)` for resolving
push mode.
### SegmentGenerationAndPushTaskExecutor
- Overrides `moveSegmentToOutputPinotFS` to handle a missing
`output.segment.dir.uri` (returns the local file URI in that case) and
delegates to `super.moveSegmentToOutputPinotFS` when the config is present.
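
The fallback in this override might look roughly like the sketch below, which only computes the resulting URI and does not copy any file. `SegmentUriFallbackSketch` and `resolveSegmentTarUri` are hypothetical names, and the way the output directory and file name are joined is an assumption rather than the PR's implementation.

```java
import java.io.File;
import java.net.URI;
import java.util.Map;

class SegmentUriFallbackSketch {
  static final String OUTPUT_DIR_KEY = "output.segment.dir.uri";

  // Without an output directory, fall back to the local file URI;
  // otherwise point at the tar file's name under the output directory.
  // Assumption: a "/" separator is added when the configured URI lacks one.
  static URI resolveSegmentTarUri(Map<String, String> configs, File localSegmentTarFile) {
    String outputDir = configs.get(OUTPUT_DIR_KEY);
    if (outputDir == null || outputDir.isEmpty()) {
      return localSegmentTarFile.toURI();
    }
    String base = outputDir.endsWith("/") ? outputDir : outputDir + "/";
    return URI.create(base + localSegmentTarFile.getName());
  }

  public static void main(String[] args) {
    File tar = new File("/tmp/seg.tar.gz"); // placeholder path
    System.out.println(resolveSegmentTarUri(Map.of(), tar));
    // "s3://example-bucket/out" is a placeholder output location.
    System.out.println(resolveSegmentTarUri(Map.of(OUTPUT_DIR_KEY, "s3://example-bucket/out"), tar));
  }
}
```
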
## Usage (single-segment METADATA push)
- Set task config `push.mode` = `METADATA` (or override `getSegmentPushType`
in a subclass).
- Provide `output.segment.dir.uri` and `push.controllerUri` in task config
for the METADATA path.
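
For illustration, a task config for the METADATA path could be assembled and checked as in the sketch below. The three keys come from this PR; the values are placeholders, and `MetadataPushConfigSketch` and `missingKeys` are hypothetical helpers, not part of Pinot.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MetadataPushConfigSketch {
  // Keys the PR says the METADATA path needs in addition to push.mode.
  static final List<String> REQUIRED_KEYS =
      List.of("output.segment.dir.uri", "push.controllerUri");

  // Returns the required keys that are absent or blank in the given config.
  static List<String> missingKeys(Map<String, String> configs) {
    List<String> missing = new ArrayList<>();
    for (String key : REQUIRED_KEYS) {
      String value = configs.get(key);
      if (value == null || value.trim().isEmpty()) {
        missing.add(key);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    Map<String, String> configs = new HashMap<>();
    configs.put("push.mode", "METADATA");
    // Placeholder values; substitute the deployment's own locations.
    configs.put("output.segment.dir.uri", "s3://example-bucket/pinot/segments");
    configs.put("push.controllerUri", "http://localhost:9000");
    System.out.println("Missing keys: " + missingKeys(configs));
  }
}
```
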