shounakmk219 opened a new pull request, #17632:
URL: https://github.com/apache/pinot/pull/17632

   ## Summary
   Adds **METADATA** (and **URI**) segment push mode to 
`BaseSingleSegmentConversionExecutor`, so single-segment conversion tasks can 
copy the segment to the output PinotFS and send its URI and metadata to the 
controller instead of always uploading the segment tarball over HTTP (TAR 
mode). Shared segment-push logic moves into `BaseTaskExecutor` so that the 
single- and multiple-segment conversion executors reuse the same helpers.
   
   ## Changes
   
   ### BaseTaskExecutor
   - Added shared segment push constants: `SEGMENT_PUSH_DEFAULT_ATTEMPTS`, 
`SEGMENT_PUSH_DEFAULT_PARALLELISM`, 
`SEGMENT_PUSH_DEFAULT_RETRY_INTERVAL_MILLIS`.
   - Added protected helpers used by both conversion executors:
     - `getPushJobSpec(configs)` – builds `PushJobSpec` from task config.
     - `generateSegmentGenerationJobSpec(tableName, configs, pushJobSpec)` – 
builds `SegmentGenerationJobSpec` for controller push.
     - `moveSegmentToOutputPinotFS(configs, localSegmentTarFile)` – copies 
local segment tar to output PinotFS; requires `output.segment.dir.uri`.
     - `getSegmentPushCommonParams(tableNameWithType)` – common HTTP params 
(parallel push protection, table name, table type).
     - `getSegmentPushMetadataHeaders(pinotTaskConfig, authProvider, 
segmentConversionResult)` – headers for metadata push (ZK metadata modifier + 
auth).
     - `getSegmentPushType(configs)` – resolves push mode from config 
(`push.mode`, default TAR); subclasses can override.
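
As a rough illustration of the last helper, push-mode resolution with a TAR default could look like the sketch below. The class, enum, and method names are hypothetical stand-ins, not the code in this PR; only the `push.mode` key and the TAR default come from the description above, and case-insensitive parsing is an assumption.

```java
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; these names are hypothetical, not Pinot's classes.
class PushModeSketch {
  // The three push modes named in this PR description.
  enum PushType { TAR, URI, METADATA }

  // Config key taken from the PR text.
  static final String PUSH_MODE_KEY = "push.mode";

  // Returns the configured push mode, or TAR when the key is absent or blank.
  static PushType resolvePushType(Map<String, String> configs) {
    String mode = configs.get(PUSH_MODE_KEY);
    if (mode == null || mode.trim().isEmpty()) {
      return PushType.TAR;
    }
    // Assumption of this sketch: values are matched case-insensitively.
    return PushType.valueOf(mode.trim().toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    System.out.println(resolvePushType(Map.of()));
    System.out.println(resolvePushType(Map.of(PUSH_MODE_KEY, "metadata")));
  }
}
```

A subclass that always wants one mode could override the resolver and return a constant, which is the extension point the last bullet describes.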
   
   ### BaseSingleSegmentConversionExecutor
   - Supports configurable push mode (TAR vs METADATA/URI) via 
`getSegmentPushType(configs)` (overridable).
   - After conversion, upload step branches on push type: **TAR** → existing 
HTTP upload; **METADATA/URI** → move segment to output PinotFS (when 
`output.segment.dir.uri` and `push.controllerUri` are set) and call 
`SegmentPushUtils.sendSegmentUriAndMetadata`.
   - Uses base `getSegmentPushCommonParams` for upload parameters and the new 
base helpers for the METADATA path (no duplicated push logic).
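
The branching described above can be sketched as follows. This is a simplified model with hypothetical names: it returns a label for the chosen path instead of performing any upload, and throwing when the two config keys are missing is an assumption of the sketch, not confirmed behavior of the executor.

```java
import java.util.Map;

// Illustrative sketch only; hypothetical names, no real upload is performed.
class UploadBranchSketch {
  enum PushType { TAR, URI, METADATA }

  // Returns a label describing which upload path would be taken.
  static String planUpload(PushType pushType, Map<String, String> configs) {
    switch (pushType) {
      case TAR:
        // Existing behavior: upload the segment tarball over HTTP.
        return "HTTP_UPLOAD";
      case URI:
      case METADATA:
        // Both keys are named in the PR text as prerequisites for this path.
        if (!configs.containsKey("output.segment.dir.uri")
            || !configs.containsKey("push.controllerUri")) {
          // Assumption: this sketch fails fast; the real executor may differ.
          throw new IllegalStateException(
              "output.segment.dir.uri and push.controllerUri are required");
        }
        return "MOVE_TO_OUTPUT_FS_THEN_SEND_URI_AND_METADATA";
      default:
        throw new IllegalArgumentException("Unsupported push type: " + pushType);
    }
  }
}
```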
   
   ### BaseMultipleSegmentsConversionExecutor
   - Removed duplicate push logic: dropped local constants and implementations 
of `getPushJobSpec`, `generateSegmentGenerationJobSpec`, 
`moveSegmentToOutputPinotFS`, and `getSegmentPushCommonParams`; uses base 
implementations.
   - `getSegmentPushCommonHeaders` now delegates to base 
`getSegmentPushMetadataHeaders`.
   - `pushSegment` uses base `getSegmentPushType(taskConfigs)` for resolving 
push mode.
   
   ### SegmentGenerationAndPushTaskExecutor
   - Overrides `moveSegmentToOutputPinotFS` to handle a missing 
`output.segment.dir.uri` (it returns the local file URI in that case) and 
delegates to `super.moveSegmentToOutputPinotFS` when the config is present.
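
A minimal sketch of that override pattern, assuming the helper returns a `java.net.URI`. Both classes are hypothetical stand-ins for the real executors, and the base method only computes a destination URI rather than copying through PinotFS.

```java
import java.io.File;
import java.net.URI;
import java.util.Map;

// Hypothetical stand-in for the base executor; the real copy step is omitted.
class BaseExecutorSketch {
  URI moveSegmentToOutputPinotFS(Map<String, String> configs, File localSegmentTarFile) {
    String outputDir = configs.get("output.segment.dir.uri");
    if (outputDir == null) {
      throw new IllegalStateException("output.segment.dir.uri is required");
    }
    // Only the destination URI is computed here; no file is moved.
    String base = outputDir.endsWith("/") ? outputDir : outputDir + "/";
    return URI.create(base + localSegmentTarFile.getName());
  }
}

// Hypothetical stand-in for the generation-and-push executor.
class GenerationAndPushExecutorSketch extends BaseExecutorSketch {
  @Override
  URI moveSegmentToOutputPinotFS(Map<String, String> configs, File localSegmentTarFile) {
    // Without an output directory, fall back to the local file URI.
    if (!configs.containsKey("output.segment.dir.uri")) {
      return localSegmentTarFile.toURI();
    }
    return super.moveSegmentToOutputPinotFS(configs, localSegmentTarFile);
  }
}
```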
   
   ## Usage (single-segment METADATA push)
   - Set task config `push.mode` = `METADATA` (or override `getSegmentPushType` 
in a subclass).
   - Provide `output.segment.dir.uri` and `push.controllerUri` in task config 
for the METADATA path.
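
Put together, the task config for this path carries three entries. The sketch below assembles them into a plain map; the helper name is hypothetical, and the URI values in `main` are placeholders rather than values from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; the helper name is hypothetical.
class MetadataPushConfigSketch {
  // Builds the three task-config entries named in the usage notes.
  static Map<String, String> metadataPushConfig(String outputDirUri, String controllerUri) {
    Map<String, String> configs = new LinkedHashMap<>();
    configs.put("push.mode", "METADATA");
    configs.put("output.segment.dir.uri", outputDirUri);
    configs.put("push.controllerUri", controllerUri);
    return configs;
  }

  public static void main(String[] args) {
    // Placeholder values for illustration only.
    Map<String, String> configs =
        metadataPushConfig("s3://example-bucket/pinot/output", "http://localhost:9000");
    configs.forEach((key, value) -> System.out.println(key + " = " + value));
  }
}
```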
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

