GitHub user wenjin272 closed a discussion: Prompt & Tool & ChatModel Design

## 1. Introduction

This article introduces Prompt, Tool and ChatModel in Flink-Agents. 
These are three different types of Resource. Users can declare them in an 
Agent, and use them in Actions.

Among them, ChatModel is the core concept, while Prompt and Tool are derivative 
concepts centered around ChatModel. Since the introduction of ChatModel depends 
on Prompt and Tool, we will first introduce Prompt and Tool in sections 2 and 3, 
and then ChatModel in section 4. In section 5, we will explain to developers how 
to use Prompt and Tool in a ChatModel. In section 6, we will show how to 
integrate and use Prompt, Tool and ChatModel within an Agent.

## 2. Prompt

A Prompt is the input used when calling a ChatModel. Depending on the interface 
of the llm (chat/complete), the Prompt will be converted to text or chat 
messages.

#### 2.1 How to create a Prompt

Users can create a Prompt from text or from chat messages.

*   Create from text
    

```python
prompt = Prompt.from_text("This is a prompt.")
```

*   Create from chat messages. This is more complicated than creating from 
text, but can provide more information to the llm. (The implementation of 
ChatMessage will be introduced in section 4.)
    

```python
prompt = Prompt.from_messages([ChatMessage(role=MessageRole.USER,
                        content="This is a prompt")])
```

#### 2.2 How to use Prompt

Because the interfaces provided by different language models accept different 
types of input, e.g. chat() accepts chat messages while complete() accepts 
text, the Prompt can be converted to both text and chat messages.

*   Convert to text string
    

```python
prompt.format_string() 
# "This is a prompt."
```

*   Convert to chat messages
    

```python
prompt.format_messages() 
# "[ChatMessage(role=MessageRole.USER, content="This is a prompt")]"
```

#### 2.3 Parametrization

Under normal circumstances, the content of a Prompt needs to be generated 
according to the actual input. Thus, Prompt supports parametric content, and 
uses the input to fill the corresponding parameters at runtime.

*   Create parametric Prompt
    

```python
prompt = Prompt.from_text("This is a prompt about {topic}.")
```

*   Use parametric Prompt
    

```python
prompt.format_string(topic="Animal")
# "This is a prompt about Animal."
```

from\_messages() and format\_messages() can be used in the same way.
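
For example, a minimal sketch of the message-based equivalent:

```python
prompt = Prompt.from_messages([ChatMessage(role=MessageRole.USER,
                        content="This is a prompt about {topic}.")])
prompt.format_messages(topic="Animal")
# [ChatMessage(role=MessageRole.USER, content="This is a prompt about Animal.")]
```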

#### 2.4 The interface of Prompt

```python
class Prompt(SerializableResource):
    """Prompt for generating the input of a ChatModel according to the input
    and the template."""

    template: Sequence[ChatMessage]

    @staticmethod
    def from_messages(messages: Sequence[ChatMessage]) -> "Prompt":
        """Create a Prompt from a sequence of ChatMessage."""

    @staticmethod
    def from_text(text: str, role: MessageRole = MessageRole.USER) -> "Prompt":
        """Create a Prompt from text."""

    def format_string(self, **kwargs: Any) -> str:
        """Return the prompt as text."""

    def format_messages(self, **kwargs: Any) -> List[ChatMessage]:
        """Return the prompt as messages."""
```

*   Currently, we only consider string Prompts. For multimodal models or other 
models that require non-text input, Prompt will be extended later.
    

## 3. Tool

#### 3.1 Definition 

Tools are utilities designed to be called by a ChatModel to perform a specific 
task.

There are many types of Tool, and the llm needs to understand the capability 
and required arguments of a Tool, so a Tool needs to contain

*   type: ToolType, the type of the tool
    
*   metadata: ToolMetadata, the name, description and required arguments of the Tool
    

and provide a call method:

*   call: execute the Tool with the given arguments
    

```python
class BaseTool(SerializableResource, ABC):
    """Base abstract class of all kinds of tools.

    Attributes:
    ----------
    metadata : ToolMetadata
        The metadata of the tool, including name, description and arguments
        schema.
    """
  
    metadata: ToolMetadata

    @classmethod
    @override
    def resource_type(cls) -> ResourceType:
        """Return resource type of class."""
        return ResourceType.TOOL
      
    @classmethod
    @abstractmethod
    def tool_type(cls) -> ToolType:
        """Return tool type of tool class."""

    @abstractmethod
    def call(self, *args: Any, **kwargs: Any) -> Any:
        """Call the tool with arguments.

        This is the method that should be implemented by the tools' developer.
        """
```

#### 3.2 ToolType

The ToolType that Flink-Agents will support includes:

*   model\_built\_in: The tool from the model provider, like 
"web\_search\_preview" of OpenAI models.
    
*   function: The python/java function defined by user.
    
*   remote\_function: The remote function indicated by name.
    
*   mcp: The tool provided by MCP server.
    

But currently, we will only support the function tool, as sketched below.
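
To make this concrete, the following is a minimal sketch of what a function 
tool implementation might look like; the class name FunctionTool and the 
ToolType.FUNCTION member are assumptions for illustration, not part of the 
current design:

```python
from typing import Any, Callable

from typing_extensions import override


class FunctionTool(BaseTool):
    """Hypothetical sketch of a tool wrapping a plain python function."""

    func: Callable

    @classmethod
    @override
    def tool_type(cls) -> ToolType:
        # assumed enum member for the "function" tool type
        return ToolType.FUNCTION

    @override
    def call(self, *args: Any, **kwargs: Any) -> Any:
        # delegate to the wrapped function with the llm-provided arguments
        return self.func(*args, **kwargs)
```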

#### 3.3 ToolMetadata

ToolMetadata is the metadata of a Tool, and includes the name, description and 
required arguments. The purpose of ToolMetadata is to make the llm understand 
the capability and argument schema of the Tool so it can generate appropriate 
tool calls.

ToolMetadata includes

*   name: str, the name of the tool
    
*   description: str, the description of the tool, which tells what the tool 
does
    
*   args\_schema: Type\[BaseModel\], the schema of the arguments the Tool 
requires
    

```python
class ToolMetadata(BaseModel):
    """Metadata of a tool which describes what the tool does and
     how to call the tool."""
    name: str
    description: str
    args_schema: Type[BaseModel]
```

##### 3.3.1 args\_schema

The args\_schema in the ToolMetadata describes the argument information 
required by the Tool. The type of args\_schema is Type\[BaseModel\] in order to 
use the schema validation capability of BaseModel.

Here we give an example.

```python
def foo(bar: int, baz: str) -> str:
    """Function for testing ToolMetadata.

    Parameters
    ----------
    bar : int
        The bar value.
    baz : str
        The baz value.

    Returns:
    -------
    str
        Response string value.
    """
    raise NotImplementedError

args_schema = create_schema_from_function(name="foo", func=foo)
print(args_schema.model_json_schema())
```

The printed schema:

```json
{
    "properties": {
        "bar": {
            "description": "The bar value.",
            "title": "Bar",
            "type": "integer"
        },
        "baz": {
            "description": "The baz value.",
            "title": "Baz",
            "type": "string"
        }
    },
    "required": [
        "bar",
        "baz"
    ],
    "title": "foo",
    "type": "object"
}
```
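
Because args\_schema is a pydantic BaseModel subclass, it can also validate 
tool-call arguments produced by the llm before the tool is invoked, e.g.:

```python
# Validate llm-produced arguments against the schema before calling the tool.
args_schema(bar=1, baz="hello")       # passes validation
args_schema(bar="oops", baz="hello")  # raises pydantic.ValidationError
```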

## 4. ChatModel

#### 4.1 Definition

ChatModel plays the core role in an Agent, providing analysis, reasoning, 
decision-making and other capabilities, as well as interacting with the outside 
world with the help of Tools.

ChatModel provides two basic methods:

*   chat: use the llm to generate output according to the input. The input and 
output are both ChatMessage.
    
*   bind\_tools: bind tools to the llm
    

```python
class BaseChatModel(Resource, ABC):
    """Base abstract class for chat models.

    Attributes:
    ----------
    prompt : Optional[Prompt] = None
        Used for generating prompt according to input.
    """

    prompt: Optional[Prompt] = None

    @abstractmethod
    def chat(
        self,
        messages: Sequence[ChatMessage]
    ) -> ChatMessage:
        """Process a sequence of messages, and return a response.

        Parameters
        ----------
        messages : Sequence[ChatMessage]
            Sequence of chat messages.

        Returns:
        -------
        ChatMessage
            Response from the ChatModel.
        """

    @abstractmethod
    def bind_tools(self, tools: Sequence[BaseTool]) -> None:
        """Bind tools to the chat model

        Parameters
        ----------
        tools : Sequence[BaseTool]
            Sequence of tools to bind to the chat model.
        """

```
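
To make the contract concrete, a minimal usage sketch (OllamaChatModel is the 
built-in implementation described in section 5; add\_tool stands for any 
BaseTool instance and is an assumption here):

```python
# Chat with a model that may answer with text, tool calls, or both.
model = OllamaChatModel(host="http://localhost:11434", model="qwen2.5:7b")
model.bind_tools([add_tool])  # add_tool: an assumed BaseTool instance

reply = model.chat([ChatMessage(role=MessageRole.USER,
                                content="What is 1 + 2?")])
print(reply.content, reply.tool_calls)
```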

*   Referring to other popular Agent frameworks like llama\_index and 
LangChain, ChatModel may need to provide the following capabilities, but these 
are not considered for the time being.
    
    *   stream chat: stream the llm output 
        
    *   async chat: asynchronous execution
        
    *   batch: accumulate requests into a batch and execute it in multiple 
threads
        
    *   rate\_limiter: limit the request frequency
        
    *   format\_output: format output to a target schema (OpenAI tool schema, 
JSON, TypedDict, Pydantic, etc.)
        

#### 4.2 ChatMessage

ChatMessage is the input and output of ChatModel, and includes the following 
fields (a minimal sketch follows the list):

*   role: MessageRole, indicating the role of the ChatMessage, including 
    
    *   system: used to tell the chat model how to behave and provide 
additional context
        
    *   user: represents input from a user interacting with the model
        
    *   assistant: represents a response from the model, which can include text 
or a request to invoke tools
        
    *   tool: a message used to pass the results of a tool invocation back to 
the model
        
*   content: str, the content of the message
    
*   tool\_calls: List\[Dict\[str, Any\]\], the information of tool calls
    
*   extra\_args: Dict\[str, Any\], some additional information depending on the 
implementation of the ChatModel
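
Putting these fields together, ChatMessage could be sketched as the following 
pydantic model (field defaults are assumptions; the exact definition may 
differ):

```python
from typing import Any, Dict, List

from pydantic import BaseModel, Field


class ChatMessage(BaseModel):
    """Sketch of the ChatMessage structure described above."""

    role: MessageRole  # MessageRole is the enum described above
    content: str = ""
    tool_calls: List[Dict[str, Any]] = Field(default_factory=list)
    extra_args: Dict[str, Any] = Field(default_factory=dict)
```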
    

##### 4.2.1 tool\_calls

tool\_calls describes the tool call information, including the tool type, tool 
name and input arguments.

Here we give an example of a ChatMessage that has tool\_calls.

```json
{
    "role": "assistant",
    "content": "xxx",
    "tool_calls": [
        {
            "id": "44d758b1-6dc1-49c9-9d7e-f059c80d6e0c",
            "type": "function",
            "function": {
                "name": "Add",
                "arguments": {
                    "a": 1,
                    "b": 2
                }
            }
        }
    ],
    "extra_args": {}
}
```

##### 4.2.2 extra\_args

extra\_args is used to store additional information in key-value form. This 
additional information may be arguments for a specific ChatModel. 

For example, when chatting with Tongyi, the user can make the model continue 
generating content from the initial text the user provides by setting 
"partial" to true.

```json
{
    "role": "assistant",
    "content": "Spring has arrived, and the earth",
    "tool_calls": [],
    "extra_args": {"partial": true}
}
```
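
In python, such a message might be constructed as follows (a sketch; the exact 
handling of the Tongyi "partial" flag depends on the ChatModel implementation):

```python
# Ask the model to continue generating from a provided prefix.
prefix = ChatMessage(role=MessageRole.ASSISTANT,
                     content="Spring has arrived, and the earth",
                     extra_args={"partial": True})
response = chat_model.chat([prefix])
```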

## 5. Use Prompt and Tool in ChatModel

In this section, to explain how to use Prompt and Tool in a ChatModel, we will 
show a built-in ChatModel implementation. 

This work is not needed by most users, who can use the built-in ChatModels 
directly. But users who want to use an llm that Flink-Agents has not yet 
integrated, or who want to customize ChatModel behavior, can define their own 
ChatModel in the same way.

#### 5.1 Extend BaseChatModel

Take the built-in OllamaChatModel as an example. This class extends 
BaseChatModel, and has the following fields:

*   prompt: Optional\[Prompt\] = None, inherited from BaseChatModel
    
*   host: str, the address of the ollama server
    
*   model: str, the name of the llm
    
*   \_\_client: Client, the client to access the ollama server
    
*   \_\_tools: Optional\[Sequence\[Mapping\[str, Any\]\]\] = None, information 
about the tools this ChatModel can use
    

```python
class OllamaChatModel(BaseChatModel):
    """Built-in ollama chat model."""

    host: str
    model: str
    __client: Client
    __tools: Optional[Sequence[Mapping[str, Any]]] = None

    def __init__(self, **data: Any) -> None:
        super().__init__(**data)
        self.__client = Client(self.host)
```

#### 5.2 Implement the chat method

A ChatModel should implement the abstract chat method of BaseChatModel. This 
method usually needs to complete the following work:

*   \[Optional\] generate the prompt according to the input ChatMessage 
sequence and the Prompt.
    
*   convert the input ChatMessage sequence to the input of the llm.
    
*   invoke the interface of the llm to generate output.
    
*   convert the llm output to an output ChatMessage.

```python
@override
def chat(
        self,
        messages: Sequence[ChatMessage],
) -> ChatMessage:
    prompt = messages
    # apply prompt
    if self.prompt is not None:
        input_variables = {}
        for msg in messages:
            input_variables.update(msg.extra_args)
        prompt = self.prompt.format_messages(**input_variables)

    # convert input messages to client input
    llm_input = self.__convert_messages_to_input(prompt)
    response = self.__client.chat(model=self.model,
                                  messages=llm_input,
                                  tools=self.__tools)

    # convert response to output message
    return self.__convert_response_to_message(response)
```
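
The two private helpers are straightforward conversions. A sketch of what they 
might look like for ollama (the exact response shape depends on the ollama 
client version, so treat this as an assumption):

```python
def __convert_messages_to_input(
        self, messages: Sequence[ChatMessage]) -> List[Mapping[str, Any]]:
    # ollama expects a list of {"role": ..., "content": ...} mappings
    return [{"role": msg.role.value, "content": msg.content}
            for msg in messages]

def __convert_response_to_message(self, response: Any) -> ChatMessage:
    # map the client response back to a flink-agents ChatMessage
    message = response["message"]
    return ChatMessage(role=MessageRole.ASSISTANT,
                       content=message.get("content", ""),
                       tool_calls=message.get("tool_calls") or [])
```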

#### 5.3 Implement the bind\_tools method

For a ChatModel that supports tool calling, it also needs to implement the 
bind\_tools method. This method usually converts the metadata of the tools to a 
format the llm can understand.

```python
@override
def bind_tools(self, tools: Sequence[BaseTool]) -> None:
    # convert flink-agents tool meta to ollama tool meta
    self.__tools = self.__convert_tools(tools)
```
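
A sketch of what \_\_convert\_tools might look like, assuming the OpenAI-style 
function schema that ollama accepts:

```python
def __convert_tools(
        self, tools: Sequence[BaseTool]) -> Sequence[Mapping[str, Any]]:
    # translate each ToolMetadata into an OpenAI-style function schema
    return [{
        "type": "function",
        "function": {
            "name": tool.metadata.name,
            "description": tool.metadata.description,
            "parameters": tool.metadata.args_schema.model_json_schema(),
        },
    } for tool in tools]
```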

## 6. Use Prompt, Tool and ChatModel in Agent

Users can use the decorators @prompt, @tool and @chat\_model to declare the 
Prompt, Tool and ChatModel to be used in the Agent.

#### 6.1 Declare Prompt

```python
class MyAgent(Agent):
    @prompt
    @staticmethod
    def converter():
        prompt = Prompt.from_text(
            "The product is {name}, {description}. Here are the evaluation "
            "details, each item contains a rating score and rating reasons: "
            "{detail}")
        return prompt
```

#### 6.2 Declare Tool

*   Describe the capability and arguments in the docstring to help the llm 
understand the tool.
    

```python
class MyAgent(Agent):
    @tool
    @staticmethod
    def add(a: int, b: int) -> int:
        """Calculate the sum of a and b.

        Parameters
        ----------
        a : int
            The first operand
        b : int
            The second operand

        Returns:
        -------
        int
            The sum of a and b.
        """
        return a + b
```

#### 6.3 Declare ChatModel

Users can use @chat\_model to declare a ChatModel in the Agent; the decorated 
method returns the ChatModel class and its initialization arguments. There are 
two ways to declare the Prompt to be used in the initialization arguments.

*   Use a Prompt by setting the Prompt object directly.
    

```python
class MyAgent(Agent):
    @chat_model
    @staticmethod
    def ollama():
        system_message = ChatMessage(
            role=MessageRole.SYSTEM,
            content="The following is the evaluation data of a product. "
                    "Please analyze it based on the user rating and user "
                    "rating content.")
        user_message = ChatMessage(role=MessageRole.USER, content="{context}")
        prompt = Prompt.from_messages([system_message, user_message])
        return OllamaChatModel, {'host': '8.8.8.8',
                                 'model': 'qwen2.5:7b',
                                 'prompt': prompt,
                                 'tools': ['send_email']}
```

*   Indicate the Prompt declared in the Agent by name.
    

```python
class MyAgent(Agent):
    @chat_model
    @staticmethod
    def ollama():
        return OllamaChatModel, {'host': '8.8.8.8',
                                 'model': 'qwen2.5:7b',
                                 'prompt': "converter",
                                 'tools': ['send_email']}
```

#### 6.4 Use Prompt, Tool and ChatModel in Action

*   After declaring these resources in the Agent, users can get and use them in 
Actions.
    

```python
@action(InputEvent)
@staticmethod
def mock(event: InputEvent, ctx: RunnerContext):
    input_data = event.value
    # get prompt
    converter_prompt = ctx.get_resource("converter", ResourceType.PROMPT)
    # get chat model
    model = ctx.get_resource("ollama", ResourceType.CHAT_MODEL)
    # get tool
    tool = ctx.get_resource("add", ResourceType.TOOL)
```
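
Continuing the sketch above, the retrieved resources can be combined to drive a 
chat (OutputEvent and the dict-shaped input\_data are assumptions for 
illustration):

```python
    # continuing inside the action body above
    # format the prompt with the input data, bind the tool, and chat
    messages = converter_prompt.format_messages(**input_data)
    model.bind_tools([tool])
    response = model.chat(messages)
    ctx.send_event(OutputEvent(output=response.content))  # assumed event type
```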

*   Flink-Agents will also provide some built-in action functions to help users 
process chat requests and tool calls. 
    

```python
def process_chat_event(event: ChatRequestEvent, ctx: RunnerContext):
    chat_model = ctx.get_resource(event.model, ResourceType.CHAT_MODEL)
    response = chat_model.chat(event.messages)
    # call tool
    if len(response.tool_calls) > 0:
        for tool_call in response.tool_calls:
            ctx.send_event(ToolRequestEvent(name=tool_call.function.name,
                                            kwargs=tool_call.function.arguments))
    
    # return response
    if len(response.content) > 0:
        ctx.send_event(ChatResponseEvent(request=event, response=response))

def process_tool_request(event: ToolRequestEvent, ctx: RunnerContext):
    tool = ctx.get_resource(event.name, ResourceType.TOOL)
    response = tool.call(**event.kwargs)
    ctx.send_event(ToolResponseEvent(request=event, response=response))
```

GitHub link: https://github.com/apache/flink-agents/discussions/75
