The GitHub Actions job "Required Checks" on texera.git/main has failed.
Run started by GitHub user github-merge-queue[bot] (triggered by 
github-merge-queue[bot]).

Head commit for run:
94da3d93875f63179fa0ae92d4936155dffba68c / Ryan Zhang 
<[email protected]>
feat(python-notebook-migration): add LLM client for notebook-to-workflow 
conversion (#5260)

### What changes were proposed in this PR?
Introduces the frontend LLM session class that converts a Jupyter
notebook into a Texera workflow JSON plus a bidirectional
cell-to-operator mapping, along with the prompt library it uses. The PR
adds two files under
`frontend/src/app/workspace/service/notebook-migration/`, totalling ~700
lines (~410 of which are prompt text).

**`migration-llm.ts`** — defines `NotebookMigrationLLM`, an
`@Injectable` class wrapping a Vercel AI SDK chat session against the
LiteLLM proxy already exposed on `main` at `/api/chat/completion`.
- `initialize(modelType, apiKey)` — builds an OpenAI-compatible chat
client via `createOpenAI({ baseURL: AppSettings.getApiEndpoint() })`,
seeds the message history with Texera documentation as `system`
messages.
- `verifyConnection()` — does a 10-token `ping` call to validate that
the API key works against the configured model.
- `convertNotebookToWorkflow(notebook)` — extracts code cells (each
tagged with a UUID in `metadata.uuid`), sends `WORKFLOW_PROMPT` + the
notebook to get a JSON of UDF operators / edges, then sends
`MAPPING_PROMPT` to get the cell↔operator mapping. Assembles a complete
Texera workflow JSON (`PythonUDFV2` operators with stub input/output
ports, links derived from the LLM's edge list, default settings) plus a
bidirectional `operator_to_cell` / `cell_to_operator` mapping. Returns
both as a JSON string.
- `close()` — clears the message history and the model reference.
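
The assembly step in `convertNotebookToWorkflow` can be illustrated with a minimal sketch. It is not the code from `migration-llm.ts`: the type names, the helper functions `invertMapping` and `edgesToLinks`, the `link-N` ID scheme, and the `output-0` / `input-0` port IDs are all assumptions made for illustration. It only shows how a `cell_to_operator` mapping could be derived from `operator_to_cell`, and how links could be derived from an LLM-provided edge list.

```typescript
// Hypothetical shapes; the real types in migration-llm.ts may differ.
type IdMapping = { [id: string]: string[] };

interface Edge {
  source: string; // ID of the upstream operator
  target: string; // ID of the downstream operator
}

interface Link {
  linkID: string;
  source: { operatorID: string; portID: string };
  target: { operatorID: string; portID: string };
}

// Invert operator -> cells into cell -> operators, so the mapping
// can be looked up in both directions.
function invertMapping(operatorToCell: IdMapping): IdMapping {
  const cellToOperator: IdMapping = {};
  for (const operatorId of Object.keys(operatorToCell)) {
    for (const cellUuid of operatorToCell[operatorId]) {
      if (cellToOperator[cellUuid] === undefined) {
        cellToOperator[cellUuid] = [];
      }
      cellToOperator[cellUuid].push(operatorId);
    }
  }
  return cellToOperator;
}

// Turn each edge into a link between stub ports. The port IDs and the
// link ID format are assumed values, not taken from the PR.
function edgesToLinks(edges: Edge[]): Link[] {
  return edges.map((edge, index) => ({
    linkID: "link-" + index,
    source: { operatorID: edge.source, portID: "output-0" },
    target: { operatorID: edge.target, portID: "input-0" },
  }));
}
```

Under these assumptions, a cell that contributes to two operators appears under both operator IDs in `cell_to_operator`, which is why the values are arrays rather than single IDs.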

**`migration-prompts.ts`** — string constants used by
`migration-llm.ts`: `TEXERA_OVERVIEW`, `TUPLE_DOCUMENTATION`,
`TABLE_DOCUMENTATION`, `OPERATOR_DOCUMENTATION`,
`UDF_INPUT_PORT_DOCUMENTATION`, `EXAMPLE_OF_GOOD_CONVERSION`,
`VISUALIZER_DOCUMENTATION`, `EXAMPLE_OF_MULTIPLE_UDF_CONVERSION`,
`WORKFLOW_PROMPT`, `MAPPING_PROMPT`.

### Any related issues, documentation, discussions?
Closes #5259 
Parent issue #4301 


### How was this PR tested?
No unit tests were included, for these reasons:
- A large portion of the change is prompt text, which can be reviewed by
reading but not unit-tested. The prompt text can still be revised later
to improve the performance of the LLM.
- Testing would require mocking a significant amount of logic that will
be introduced in later PRs, since the logic in `migration-llm.ts`
consists of parsing the LLM's response.

However, I am open to writing tests based on review feedback.


### Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.7)

---------

Co-authored-by: Meng Wang <[email protected]>

Report URL: https://github.com/apache/texera/actions/runs/28193325589

With regards,
GitHub Actions via GitBox
