featzhang created FLINK-39629:
---------------------------------

             Summary: Add Async I/O operator that invokes GPU sidecar from 
DataStream API
                 Key: FLINK-39629
                 URL: https://issues.apache.org/jira/browse/FLINK-39629
             Project: Flink
          Issue Type: Sub-task
          Components: API / DataStream
            Reporter: featzhang


h2. Background

With a functional sidecar that batches inference requests, user programs
written against the DataStream API need a convenient, correctly ordered,
back-pressure-friendly way to invoke the sidecar. Flink's existing Async
I/O subsystem ({{AsyncDataStream}}, {{AsyncFunction}},
{{RichAsyncFunction}}) already provides the required machinery for
ordered / unordered result emission, timeouts, and capacity limits.

This sub-task adds a first-party async function that delegates record-by-
record inference to the sidecar.

h2. Scope of this sub-task

* Add {{GpuSidecarAsyncFunction<IN, OUT>}} in a new package under
 {{flink-streaming-java}} or a dedicated {{flink-gpu-client}} module.
 Constructor parameters:
** Sidecar endpoint (UDS or TCP).
** Request timeout.
** Maximum inflight requests (capacity).
** A pluggable {{TensorCodec<IN, OUT>}} that converts records to / from
 the sidecar's tensor wire format.
* Manage one RPC channel per parallel subtask; reconnect with exponential
 backoff.
* Honour Flink checkpointing: inflight requests are drained on checkpoint
 barriers when ordered emission is requested.
* Surface the sidecar's structured error codes as
 {{AsyncFunction}}-level failures so the user program can apply Flink's
 standard restart strategies.

h2. Out of scope

* Table / SQL integration (tracked in a later sub-task).
* Scheduling operators onto nodes with a live sidecar (tracked in a later
 sub-task).
* Model-hot-reload UX on the DataStream side.

h2. Acceptance criteria

* End-to-end test: a job with a source, a
 {{GpuSidecarAsyncFunction}}, and a sink runs against the mock sidecar
 on CI and produces deterministic output under ordered emission.
* Timeout and capacity settings behave as documented.
* Clean shutdown on job cancellation; no connection leak.

h2. Affected modules

* {{flink-streaming-java}} (or new {{flink-gpu-client}})
* {{flink-tests}} (integration test against mock sidecar)

h2. Links

Parent: see umbrella issue linked to this sub-task.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to