weiqingy opened a new pull request, #843: URL: https://github.com/apache/flink-agents/pull/843
Linked issue: #280

### Purpose of change

Today `output_schema` is honored only by prompt-engineering the request and parsing the response text; no chat-model integration uses a provider's native structured-output API, which is the most reliable strategy.

This PR adds the foundation for native structured output at the chat-model connection layer, plus the OpenAI implementation, in both Java and Python. It is the first in a small stack under #280 (Azure/Ollama, Anthropic, and DashScope follow in separate PRs; wiring native output into the ReActAgent final-output flow is a separate follow-up).

How it works:

- The request's output schema is carried to the connection through a reserved key (`__structured_output_schema__`) in the existing `modelParams`/`kwargs` map, so the abstract `chat()` signature is unchanged.
- Each connection declares a boolean native-structured-output capability (`supportsNativeStructuredOutput()` / `supports_native_structured_output`), default `false`.
- A connection applies the native API only when: a schema is present, no tools are bound on the call, the schema is a POJO (Java) / `BaseModel` (Python) rather than a `RowTypeInfo`, and the setup is same-language.
- The reserved key is always removed before the SDK call so it cannot leak into a provider request.
- The prompt-engineered path is retained as the fallback and is unaffected. In the ReAct loop tools are always bound, so the native path stays dormant there and existing behavior is unchanged.

OpenAI applies `response_format` json_schema with strict validation. The other connections only strip the reserved key for now; their native paths arrive in later PRs.

The same-language guard matters because native structured output cannot work across the language boundary (a Java `Class` is not a Python `BaseModel`), so the schema object is never marshaled across the Pemja bridge.
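For reviewers who want the gating rules in one place, here is a minimal Python sketch of the flow described above. It is not the code in this PR. The reserved key string and the `response_format` json_schema shape come from the description; everything else is a hypothetical stand-in, including `SchemaModel` (used in place of pydantic's `BaseModel` so the sketch has no dependencies), `Weather`, `pop_schema`, `use_native`, and `build_request`. The same-language guard is left out because, per the description, it is enforced upstream by never threading the key into a cross-language setup.

```python
from typing import Any, Dict, List, Optional

# Reserved key named in the PR description.
STRUCTURED_OUTPUT_SCHEMA_KEY = "__structured_output_schema__"


class SchemaModel:
    """Hypothetical stand-in for pydantic's BaseModel (keeps the sketch stdlib-only)."""

    @classmethod
    def json_schema(cls) -> Dict[str, Any]:
        raise NotImplementedError


class Weather(SchemaModel):
    """Hypothetical output schema, used only for illustration."""

    @classmethod
    def json_schema(cls) -> Dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["city", "temp_c"],
            "additionalProperties": False,
        }


def pop_schema(kwargs: Dict[str, Any]) -> Optional[Any]:
    """Remove the reserved key so it can never reach the provider SDK."""
    return kwargs.pop(STRUCTURED_OUTPUT_SCHEMA_KEY, None)


def use_native(schema: Any, tools: Optional[List[Any]], supports_native: bool) -> bool:
    """Gate: capability declared, no tools bound, schema is a model class."""
    return (
        supports_native
        and not tools
        and isinstance(schema, type)  # False for None and for non-class schemas
        and issubclass(schema, SchemaModel)
    )


def build_request(
    messages: List[Dict[str, str]],
    tools: Optional[List[Any]] = None,
    supports_native: bool = False,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Assemble the provider request dict; the reserved key is always stripped."""
    schema = pop_schema(kwargs)
    request: Dict[str, Any] = {"messages": messages, **kwargs}
    if tools:
        request["tools"] = tools
    if use_native(schema, tools, supports_native):
        request["response_format"] = {
            "type": "json_schema",
            "json_schema": {
                "name": schema.__name__,
                "schema": schema.json_schema(),
                "strict": True,
            },
        }
    return request
```

Calling `build_request(messages, supports_native=True, **{STRUCTURED_OUTPUT_SCHEMA_KEY: Weather})` yields a request that carries `response_format`; binding any tool, or passing a schema that is not a model class, yields the plain request. In every case the reserved key is absent from the result, which is the property the leak tests check.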
### Tests

Unit tests with the provider SDK mocked (no network):

- OpenAI native applied when a schema is present and no tools are bound (Java + Python); the SDK request carries `response_format` json_schema strict.
- Native NOT applied when tools are bound (the no-regression gate), and NOT applied for a `RowTypeInfo` schema.
- The reserved key never leaks to a provider SDK (Python connections forward `**kwargs`, so each strips it; a direct unit test of the base pop helper covers removal).
- Same-language threading guard: a cross-language setup with an `output_schema` does not receive the reserved key; a same-language setup does.
- Existing ReActAgent prompt-path tests remain green unchanged.

### API

Yes, additive only. `BaseChatModelConnection` gains a public reserved-key constant and a `protected` capability method (default `false`); no existing signatures change.

### Documentation

- [ ] `doc-needed`
- [x] `doc-not-needed`
- [ ] `doc-included`

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
