Hi All,

I would like feedback on something I have been experimenting with. For
context, the API server has to import every single hook class from all
providers just to render connection forms in the UI. This is because the
UI metadata (which fields to show, labels, validators, etc.) lives in
Python functions such as `get_connection_form_widgets()` and
`get_ui_field_behaviour()`, which are defined on the hook classes.

This means:
- API server startup imports 100+ hook classes it might not actually need
- Slower startup and a heavier memory footprint from those imports
- Poor client-server separation (why does the API server need to know
  about pyodbc just to show a UI form?)

My proposal

Move the UI metadata from Python code to something static and declarative,
such as YAML. I want to add this information to the provider.yaml file
that every provider already has. For example:

class PostgresHook(BaseHook):
    @classmethod
    def get_ui_field_behaviour(cls) -> dict[str, Any]:
        return {
            "hidden_fields": [],
            "relabeling": {
                "schema": "Database",
            },
        }

Will become:

connection-types:
  - connection-type: postgres
    hook-class-name: airflow.providers.postgres.hooks.postgres.PostgresHook

    ui-field-behaviour:
      hidden-fields: []
      relabeling:
        schema: "Database"

    conn-fields:
      sslmode:
        type: string
        label: SSL Mode
        enum: ["disable", "prefer", "require"]
        default: "prefer"

      timeout:
        type: integer
        label: Timeout
        range: [1, 300]
        default: 30

The schema will now consist of two new sections:

1. ui-field-behaviour
- Used to customize the standard connection fields (host, port, login, etc.)
- hidden-fields: Hide some fields
- relabeling: Change labels for some fields (like schema -> Database above)
- placeholders: Show hints in the form (port 5432 for example)

2. conn-fields
- Can be used to define custom fields stored in Connection.extra
- You can define inline validators such as enum, range, pattern,
  min-length, and max-length
- Will support the standard wtforms types: string, integer, boolean, and
  number
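
To make the two sections concrete, here is a rough Python sketch of how a
consumer could apply them once the YAML has been parsed into a dict. It is
only an illustration, not what the PR implements: the helper names
(`standard_field_labels`, `validate_extra`), the list of standard fields,
the default labels, and the validator semantics (inclusive range,
full-match pattern) are all assumptions.

```python
import re
from typing import Any

# Hypothetical parsed form of the provider.yaml example above.
CONNECTION_TYPE: dict[str, Any] = {
    "connection-type": "postgres",
    "ui-field-behaviour": {
        "hidden-fields": [],
        "relabeling": {"schema": "Database"},
        "placeholders": {"port": "5432"},
    },
    "conn-fields": {
        "sslmode": {"type": "string", "label": "SSL Mode",
                    "enum": ["disable", "prefer", "require"], "default": "prefer"},
        "timeout": {"type": "integer", "label": "Timeout",
                    "range": [1, 300], "default": 30},
    },
}

# Assumed set of standard connection fields, for illustration only.
STANDARD_FIELDS = ["host", "schema", "login", "password", "port"]

# Assumed mapping from schema type names to Python types.
PYTHON_TYPES = {"string": str, "integer": int, "boolean": bool, "number": (int, float)}


def standard_field_labels(behaviour: dict[str, Any]) -> dict[str, str]:
    """Return label per visible standard field, applying hidden-fields and relabeling."""
    hidden = set(behaviour.get("hidden-fields", []))
    relabeling = behaviour.get("relabeling", {})
    return {
        name: relabeling.get(name, name.capitalize())  # placeholder default label
        for name in STANDARD_FIELDS
        if name not in hidden
    }


def validate_extra(name: str, value: Any, spec: dict[str, Any]) -> list[str]:
    """Check one custom field value against its inline validators."""
    expected = PYTHON_TYPES[spec["type"]]
    # bool is a subclass of int in Python, so reject it for non-boolean fields.
    if not isinstance(value, expected) or (isinstance(value, bool) and spec["type"] != "boolean"):
        return [f"{name}: expected {spec['type']}"]
    errors = []
    if "enum" in spec and value not in spec["enum"]:
        errors.append(f"{name}: must be one of {spec['enum']}")
    if "range" in spec and isinstance(value, (int, float)):
        low, high = spec["range"]
        if not low <= value <= high:  # assumed inclusive on both ends
            errors.append(f"{name}: must be between {low} and {high}")
    if isinstance(value, str):
        if "pattern" in spec and not re.fullmatch(spec["pattern"], value):
            errors.append(f"{name}: does not match {spec['pattern']}")
        if "min-length" in spec and len(value) < spec["min-length"]:
            errors.append(f"{name}: shorter than {spec['min-length']}")
        if "max-length" in spec and len(value) > spec["max-length"]:
            errors.append(f"{name}: longer than {spec['max-length']}")
    return errors
```

With the example above, `validate_extra("timeout", 301, ...)` would report
a range error and the `schema` field would be labelled "Database", all
without importing PostgresHook.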

As for why this schema was chosen, see the comparison with alternatives in
the PR description: https://github.com/apache/airflow/pull/60410


Current Status

I have a POC in https://github.com/apache/airflow/pull/60410 where I chose
two pilot providers of varying difficulty: HTTP and SMTP (HTTP is easy,
just a vanilla form, but SMTP has some hidden fields).


Benefits this will offer

- Once complete, the API server won't import any hook classes for UI
  rendering, leading to faster startup
- Provider dependencies don't affect the API server
- YAML is easier to read and write than Python functions for form metadata

Would love feedback on:
1. Schema design - does it cover your use cases?
2. Any missing field types or validators?

The goal is to get the pilot providers in so we can start migrating the
remaining providers incrementally. The old way still works, so there is no
rush for everyone to migrate at once.
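
As a rough illustration of how the two paths could coexist during the
migration, here is a minimal Python sketch. It is an assumption about one
possible lookup order, not the code in the PR: the function name
`load_ui_field_behaviour` and the key normalisation are hypothetical.

```python
import importlib
from typing import Any


def load_ui_field_behaviour(conn_type_entry: dict[str, Any]) -> dict[str, Any]:
    """Prefer static YAML metadata; fall back to the hook class method."""
    static = conn_type_entry.get("ui-field-behaviour")
    if static is not None:
        # New way: no hook import needed. Convert hyphenated YAML keys
        # (hidden-fields) to the Python-style keys (hidden_fields).
        return {key.replace("-", "_"): value for key, value in static.items()}

    # Old way: import the hook class and call its classmethod.
    module_path, _, class_name = conn_type_entry["hook-class-name"].rpartition(".")
    hook_class = getattr(importlib.import_module(module_path), class_name)
    return hook_class.get_ui_field_behaviour()
```

A provider that has already migrated never triggers the import, while one
that has not keeps working exactly as it does today.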

Thoughts?

Thanks & Regards,
Amogh Desai
