AlejandroMorgante commented on code in PR #68479: URL: https://github.com/apache/airflow/pull/68479#discussion_r3428025216
##########
providers/google/src/airflow/providers/google/cloud/hooks/vertex_ai/agent_engine.py:
##########
@@ -0,0 +1,254 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""This module contains a Google Cloud Vertex AI Agent Engine hook."""
+
+from __future__ import annotations
+
+import json
+import time
+from collections.abc import Sequence
+from typing import Any
+
+from asgiref.sync import sync_to_async
+from google.genai._api_client import HttpOptions
+from google.genai.errors import ClientError
+from vertexai import Client
+
+from airflow.providers.google.common.hooks.base_google import (
+    PROVIDE_PROJECT_ID,
+    GoogleBaseAsyncHook,
+    GoogleBaseHook,
+)
+
+
+class AgentEngineHook(GoogleBaseHook):
+    """Hook for Google Cloud Vertex AI Agent Engine APIs."""
+
+    def __init__(
+        self,
+        gcp_conn_id: str = "google_cloud_default",
+        impersonation_chain: str | Sequence[str] | None = None,
+        **kwargs,
+    ) -> None:
+        super().__init__(
+            gcp_conn_id=gcp_conn_id,
+            impersonation_chain=impersonation_chain,
+            **kwargs,
+        )
+
+    def get_agent_engine_client(self, project_id: str, location: str):
+        """Return the Vertex AI Agent Engine client."""
+        return Client(
+            project=project_id,
+            location=location,
+            credentials=self.get_credentials(),
+        ).agent_engines
+
+    @GoogleBaseHook.fallback_to_default_project_id
+    def create_agent_engine(
+        self,
+        location: str,
+        agent: Any | None = None,
+        agent_engine: Any | None = None,
+        config: Any | None = None,
+        project_id: str = PROVIDE_PROJECT_ID,
+    ) -> Any:
+        """
+        Create an Agent Engine.
+
+        :param location: Required. The ID of the Google Cloud location that the service belongs to.
+        :param agent: Optional. The agent object to deploy.
+        :param agent_engine: Optional. Deprecated alias for ``agent``.
+        :param config: Optional. Configuration for the Agent Engine.
+        :param project_id: Optional. The ID of the Google Cloud project. Defaults to the project
+            configured in the connection.
+        """
+        client = self.get_agent_engine_client(project_id=project_id, location=location)
+        return client.create(agent=agent, agent_engine=agent_engine, config=config)
+
+    @GoogleBaseHook.fallback_to_default_project_id
+    def get_agent_engine(
+        self,
+        location: str,
+        name: str,
+        project_id: str = PROVIDE_PROJECT_ID,
+    ) -> Any:
+        """
+        Get an Agent Engine.
+
+        :param location: Required. The ID of the Google Cloud location that the service belongs to.
+        :param name: Required. The Agent Engine resource name.
+        :param project_id: Optional. The ID of the Google Cloud project. Defaults to the project
+            configured in the connection.
+        """
+        client = self.get_agent_engine_client(project_id=project_id, location=location)
+        return client.get(name=name)
+
+    @GoogleBaseHook.fallback_to_default_project_id
+    def query_agent_engine(

Review Comment:
I agree that `run_query_job` is a synchronous Python method call, but it represents a different async job workflow. It requires `output_gcs_uri`, writes input/output through GCS, and returns `RunQueryJobResult` with job/GCS metadata. It does not return the direct synchronous `{reasoningEngine}:query` response.

`QueryAgentEngineOperator` is intended for the direct request-response query use case, where the task waits for the Agent Engine response and returns the query output. In this implementation, the operator returns `response["output"]` when that field is present and not `None`; otherwise it returns the full response dict to avoid dropping response metadata.

A `run_query_job`-based flow would make sense as a separate operator because it has different required parameters and different return semantics.

If the concern about relying on private SDK internals is blocking, I can remove the synchronous `QueryAgentEngineOperator` for now and keep only a public-SDK-based query-job operator using `run_query_job`.

The tradeoff is that this would not cover the direct request-response `{reasoningEngine}:query` use case yet. Users would need to provide `output_gcs_uri`, and the operator would return job/GCS metadata rather than the direct query output. We could add the synchronous query operator later once the SDK exposes a supported public path for that endpoint.
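
For illustration only, the return behaviour described above for `QueryAgentEngineOperator` could be sketched roughly as follows. This is not the code in the PR: `select_query_result` is a hypothetical helper name, and the response is assumed to be a plain dict.

```python
from typing import Any


def select_query_result(response: dict[str, Any]) -> Any:
    """Pick what a direct-query task would return (hypothetical helper).

    Returns response["output"] when that key is present and not None;
    otherwise returns the full response dict so metadata is not dropped.
    """
    output = response.get("output")
    if output is not None:
        return output
    return response


# An "output" value is returned on its own.
print(select_query_result({"output": "hello", "meta": {"id": 1}}))
# A missing or None "output" falls back to the whole response.
print(select_query_result({"meta": {"id": 1}}))
```

Note that the check is `is not None` rather than truthiness, so an empty string or empty list in `output` would still be returned as the query result.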
