aminghadersohi opened a new issue, #37971:
URL: https://github.com/apache/superset/issues/37971

   ### Motivation
   
   Apache Superset needs a robust, user-managed API key system for programmatic 
access. Current authentication options are limited:
   
   1. **Flask Session Cookies** - Require browser-based login, unsuitable for 
programmatic access
   2. **FAB API Tokens (JWT)** - Short-lived tokens requiring frequent refresh, 
not designed for long-running integrations
   
   With the emergence of MCP (Model Context Protocol) and AI-powered 
integrations, there's increasing demand for secure, long-lived API credentials 
that users can manage themselves. API keys provide a familiar pattern (similar 
to GitHub Personal Access Tokens, AWS API Keys, etc.) that enables:
   
   - **MCP Service Integration**: AI assistants connecting to Superset via MCP 
servers
   - **CI/CD Pipelines**: Automated dashboard deployments and testing
   - **External Applications**: Third-party tools querying Superset APIs
   - **Scheduled Jobs**: Cron jobs or workflow orchestrators interacting with 
Superset
   
   ### Proposed Change
   
   #### Architecture: Implement in Flask-AppBuilder
   
   Based on community feedback, API key authentication will be implemented at 
the **Flask-AppBuilder (FAB) layer** rather than as a Superset-specific 
feature. This ensures:
   
   1. **Unified auth management** - No split between FAB and Superset auth 
systems
   2. **Automatic decorator compatibility** - `@protect()`, `@safe`, 
`@has_access` work out of the box
   3. **MCP compatibility** - `SupersetSecurityManager` extends FAB's 
`SecurityManager`, so MCP inherits validation automatically
   4. **One mechanism** - New API keys will be the single long-lived 
programmatic access mechanism, eventually replacing FAB's short-lived JWT tokens
   5. **Broader ecosystem benefit** - Any FAB-based project can use API keys, 
not just Superset
   
   | Concern | FAB Implementation | Superset Implementation |
   |---|---|---|
   | `@protect()` / `@has_access` | Works automatically | Requires custom middleware |
   | User entity ownership | FAB owns `ab_user` | Would duplicate auth management |
   | MCP service | Inherits via `SupersetSecurityManager` | Requires separate integration |
   | Other FAB projects | All benefit | Superset-only |
   | Maintenance | Single auth system | Split auth management |
   
   #### Overview
   
   Introduce a new `ApiKey` model in FAB and associated infrastructure that 
allows users to:
   
   1. Create API keys with optional expiration dates and optional scopes
   2. View and manage their active keys
   3. Revoke keys when no longer needed
   4. Use keys for Bearer token authentication on all API endpoints (REST API 
and MCP)
   
   #### Authentication Flow
   
   ```
   ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
   │  Client/MCP     │     │    Superset     │     │   FAB Security  │
   │  Application    │     │    API / MCP    │     │   Manager       │
   └────────┬────────┘     └────────┬────────┘     └────────┬────────┘
            │                       │                       │
            │  Authorization:       │                       │
            │  Bearer sst_xxx...    │                       │
            ├──────────────────────►│                       │
            │                       │                       │
            │                       │  Validate API Key     │
            │                       ├──────────────────────►│
            │                       │                       │
            │                       │  Return User Context  │
            │                       │◄──────────────────────┤
            │                       │                       │
            │  API Response         │                       │
            │◄──────────────────────┤                       │
            │                       │                       │
   ```
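   
   The first step of this flow, pulling the key out of the `Authorization` header, could look roughly like the sketch below. `extract_api_key` is a hypothetical helper name, not an existing FAB function; it assumes only the `Bearer` scheme and the `sst_` prefix described in this proposal.

```python
from __future__ import annotations

API_KEY_PREFIX = "sst_"  # prefix proposed in this issue


def extract_api_key(authorization_header: str | None) -> str | None:
    """Return the API key from an Authorization header value, or None.

    Hypothetical helper: returns None for missing headers, non-Bearer
    schemes, and Bearer tokens without the proposed prefix (for example
    existing JWTs, which would fall through to the JWT code path).
    """
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token.startswith(API_KEY_PREFIX):
        return None
    return token
```

   Returning `None` for non-`sst_` Bearer tokens is one way to let JWT and API key authentication coexist on the same header.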
   
   #### Key Format
   
   API keys follow a structured format for easy identification:
   
   ```
   sst_<base64_encoded_random_bytes>
   ```
   
   - `sst_` prefix identifies Superset API keys (similar to `ghp_` for GitHub, 
`sk_` for Stripe)
   - 32 bytes of cryptographically secure random data, base64 encoded
   - Only the hash is stored in the database; the plaintext key is shown once 
at creation
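   
   As a rough illustration of this format, a key could be generated with the standard library as shown below. The function name is hypothetical, and the use of URL-safe base64 without padding is an assumption; the proposal only says "base64 encoded".

```python
from __future__ import annotations

import base64
import secrets

API_KEY_PREFIX = "sst_"


def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, key_prefix) for a new key.

    Assumption: URL-safe base64 without padding, so the key contains no
    '+', '/' or '=' characters.
    """
    random_bytes = secrets.token_bytes(32)  # 32 bytes from the OS CSPRNG
    encoded = base64.urlsafe_b64encode(random_bytes).rstrip(b"=").decode("ascii")
    plaintext_key = API_KEY_PREFIX + encoded
    key_prefix = plaintext_key[:8]  # same length as the "sst_dGhp" example
    return plaintext_key, key_prefix
```

   The plaintext key would be returned to the user once, while only its hash and the short prefix are persisted.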
   
   #### Database Schema
   
   A new `api_key` table in FAB:
   
   ```sql
   CREATE TABLE api_key (
       id INTEGER PRIMARY KEY,
       uuid UUID NOT NULL UNIQUE,
       name VARCHAR(255) NOT NULL,
       key_hash VARCHAR(255) NOT NULL,
       key_prefix VARCHAR(10) NOT NULL,         -- First chars for identification
       user_id INTEGER NOT NULL REFERENCES ab_user(id),
       created_on TIMESTAMP NOT NULL,
       created_by_fk INTEGER REFERENCES ab_user(id),
       changed_on TIMESTAMP,
       changed_by_fk INTEGER REFERENCES ab_user(id),
       expires_on TIMESTAMP,                    -- NULL = never expires
       revoked_on TIMESTAMP,                    -- NULL = active
       last_used_on TIMESTAMP,
       scopes JSON,                             -- Included from day 1 for forward-compatibility
       UNIQUE(key_hash)
   );
   
   CREATE INDEX idx_api_key_user ON api_key(user_id);
   -- key_hash is already indexed through its UNIQUE constraint,
   -- so no separate index on it is needed.
   ```
   
   The `scopes` JSON column is included from the start. In Phase 1, NULL or 
empty scopes mean "full user permissions." Phase 2 adds validation logic 
without requiring a migration.
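   
   The `expires_on` and `revoked_on` semantics noted in the schema comments (NULL = never expires, NULL = active) could be checked along these lines. `ApiKeyRecord` is an illustrative stand-in for a table row, not the proposed FAB model, and `is_valid` is written as a method taking `now` purely to make the sketch easy to test.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ApiKeyRecord:
    """Stand-in for a row of the proposed api_key table (illustrative only)."""

    expires_on: datetime | None = None  # None = never expires
    revoked_on: datetime | None = None  # None = active

    def is_valid(self, now: datetime | None = None) -> bool:
        """True if the key is neither revoked nor expired at `now`."""
        now = now or datetime.now(timezone.utc)
        if self.revoked_on is not None:
            return False
        return self.expires_on is None or now < self.expires_on
```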
   
   #### API Endpoints
   
   New endpoints under `/api/v1/security/api_keys/`:
   
   | Method | Endpoint | Description |
   |---|---|---|
   | GET | `/api/v1/security/api_keys/` | List current user's API keys |
   | POST | `/api/v1/security/api_keys/` | Create a new API key |
   | GET | `/api/v1/security/api_keys/{uuid}` | Get API key details |
   | DELETE | `/api/v1/security/api_keys/{uuid}` | Revoke an API key |
   
   #### Create API Key Request
   
   ```json
   {
     "name": "MCP Integration Key",
     "expires_on": "2025-12-31T23:59:59Z",
     "scopes": ["dashboards:read", "charts:read"]
   }
   ```
   
   `scopes` is optional. In Phase 1, this parameter is accepted but not 
enforced (all scopes are granted). In Phase 2, scopes will be validated and 
enforced.
   
   #### Create API Key Response (shown only once)
   
   ```json
   {
     "id": 1,
     "uuid": "a1b2c3d4-5678-90ab-cdef-1234567890ab",
     "name": "MCP Integration Key",
     "key": "sst_dGhpcyBpcyBhIHRlc3Qga2V5...",
     "key_prefix": "sst_dGhp",
     "created_on": "2025-01-15T10:30:00Z",
     "expires_on": "2025-12-31T23:59:59Z",
     "scopes": ["dashboards:read", "charts:read"]
   }
   ```
   
   #### FAB Security Manager Integration
   
   API key validation will be implemented in FAB's `SecurityManager`, which 
`SupersetSecurityManager` extends:
   
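   ```python
   # In FAB SecurityManager
   from datetime import datetime, timezone
   
   
   class SecurityManager:
       def validate_api_key(self, api_key: str) -> User | None:
           """Authenticate a user via API key; return None if the key is invalid."""
           # Cheap rejection of Bearer tokens that are not API keys (e.g. JWTs)
           if not api_key.startswith("sst_"):
               return None
   
           # Only the hash is stored, so look the key up by its hash
           key_hash = self.hash_api_key(api_key)
           api_key_obj = self.get_session.query(ApiKey).filter_by(
               key_hash=key_hash
           ).one_or_none()
   
           # is_valid is expected to cover both expiration and revocation
           if api_key_obj and api_key_obj.is_valid:
               # Timezone-aware replacement for the deprecated datetime.utcnow()
               api_key_obj.last_used_on = datetime.now(timezone.utc)
               return api_key_obj.user
           return None
   ```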
   
   FAB's `@protect()` and `@has_access` decorators will be updated to check for 
`Authorization: Bearer sst_...` headers, ensuring API keys work across all 
existing endpoints without changes.
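   
   One detail the snippet above leaves open is how `hash_api_key` interacts with the proposed `API_KEY_HASH_METHOD = "pbkdf2:sha256"` setting: a salted hash differs on every call, so an equality lookup on `key_hash` would not find the row. A possible approach, sketched here with the standard library rather than werkzeug, is to narrow candidates by `key_prefix` and then verify each salted hash. All function names and the `salt$digest` storage format are assumptions for illustration.

```python
from __future__ import annotations

import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value only


def hash_key(plaintext: str, salt: bytes | None = None) -> str:
    """Return 'salt_hex$digest_hex' using PBKDF2-HMAC-SHA256."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", plaintext.encode(), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_key(plaintext: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt_hex, _, _ = stored.partition("$")
    recomputed = hash_key(plaintext, bytes.fromhex(salt_hex))
    return hmac.compare_digest(recomputed, stored)


def find_key(plaintext: str, rows: list[dict]) -> dict | None:
    """Narrow by key_prefix, then verify each candidate's salted hash."""
    prefix = plaintext[:8]
    for row in rows:
        if row["key_prefix"] == prefix and verify_key(plaintext, row["key_hash"]):
            return row
    return None
```

   The alternative is an unsalted or keyed digest, which allows a direct indexed lookup; since the keys carry 32 bytes of randomness, either choice may be acceptable.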
   
   #### MCP Service Integration
   
   Since `SupersetSecurityManager` extends FAB's `SecurityManager`, MCP gets 
API key support automatically:
   
   ```python
   # superset/mcp_service/auth.py
   # Minimal change - just call the inherited method
   user = security_manager.validate_api_key(bearer_token)
   if user:
       g.user = user
   ```
   
   #### SECRET_KEY Rotation Support
   
   API key validation supports `SECRET_KEY` rotation via `PREVIOUS_SECRET_KEY`. 
During rotation:
   
   1. Validate against current `SECRET_KEY` first
   2. Fall back to `PREVIOUS_SECRET_KEY` if validation fails
   3. This matches the existing pattern used by flask-jwt-extended
   
   Since API keys are validated by hash lookup (not JWT signature 
verification), `SECRET_KEY` rotation is less critical for API keys than for 
JWTs. However, if any signing is involved (e.g., for the key prefix or audit), 
rotation support is included.
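   
   If the stored hash were keyed with `SECRET_KEY` (an assumption; the proposal does not specify this), the current-then-previous fallback described above might be sketched as follows. Both function names are hypothetical.

```python
from __future__ import annotations

import hashlib
import hmac


def keyed_hash(api_key: str, secret: str) -> str:
    """HMAC-SHA256 of the API key under a server secret (assumed scheme)."""
    return hmac.new(secret.encode(), api_key.encode(), hashlib.sha256).hexdigest()


def matches_stored_hash(
    api_key: str,
    stored_hash: str,
    secret_key: str,
    previous_secret_key: str | None,
) -> bool:
    """Check the current secret first, then fall back to the previous one."""
    for secret in (secret_key, previous_secret_key):
        if secret and hmac.compare_digest(keyed_hash(api_key, secret), stored_hash):
            return True
    return False
```

   A key that only matches under `PREVIOUS_SECRET_KEY` could be re-hashed under the current secret at that point, so the fallback is only needed during the rotation window.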
   
   #### Feature Flag & Configuration
   
   Development phase:
   
   ```python
   FEATURE_FLAGS = {
       "ENABLE_API_KEYS": False,  # Default disabled during development
   }
   ```
   
   Once stable, promoted to a top-level configuration enabled by default:
   
   ```python
   # superset_config.py
   API_KEY_AUTH_ENABLED = True  # Top-level config, enabled by default
   
   # Hashing (using werkzeug for algorithm flexibility)
   API_KEY_HASH_METHOD = "pbkdf2:sha256"
   API_KEY_HASH_SALT_LENGTH = 16
   
   # Limits
   API_KEY_MAX_PER_USER = 10  # 0 = unlimited
   
   # Expiration
   API_KEY_DEFAULT_EXPIRATION_DAYS = None  # None = never expires
   API_KEY_REQUIRE_EXPIRATION = False
   
   # Cleanup
   API_KEY_CLEANUP_RETENTION_DAYS = 90  # Days to retain revoked/expired keys
   ```
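   
   How these settings might interact at key-creation time is not spelled out here; the sketch below shows one plausible reading. The function names and the precedence (explicit request, then default, then the "required" check) are assumptions.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def resolve_expiration(
    requested: datetime | None,
    now: datetime,
    default_days: int | None = None,  # API_KEY_DEFAULT_EXPIRATION_DAYS
    require_expiration: bool = False,  # API_KEY_REQUIRE_EXPIRATION
) -> datetime | None:
    """Pick the expires_on value for a new key, or raise if one is required."""
    if requested is not None:
        return requested
    if default_days is not None:
        return now + timedelta(days=default_days)
    if require_expiration:
        raise ValueError("an expiration date is required for new API keys")
    return None  # key never expires


def can_create_key(active_key_count: int, max_per_user: int = 10) -> bool:
    """Apply API_KEY_MAX_PER_USER, where 0 means unlimited."""
    return max_per_user == 0 or active_key_count < max_per_user
```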
   
   #### Cleanup of Revoked/Expired Keys
   
   Revoked and expired keys are automatically cleaned up following the same 
pattern as the `logs` table cleanup:
   
   - A periodic task deletes revoked/expired keys older than 
`API_KEY_CLEANUP_RETENTION_DAYS`
   - Default retention: 90 days (configurable)
   - Preserves audit trail for the configured retention period
   - Uses existing Superset task scheduling infrastructure
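   
   The selection rule for the periodic task could be expressed roughly as below. `is_purgeable` is a hypothetical name, and measuring retention from the revocation or expiration timestamp is an assumption about what "older than" means here.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def is_purgeable(
    expires_on: datetime | None,
    revoked_on: datetime | None,
    now: datetime,
    retention_days: int = 90,  # API_KEY_CLEANUP_RETENTION_DAYS
) -> bool:
    """True if the key was revoked or expired more than retention_days ago."""
    cutoff = now - timedelta(days=retention_days)
    revoked_long_ago = revoked_on is not None and revoked_on < cutoff
    expired_long_ago = expires_on is not None and expires_on < cutoff
    return revoked_long_ago or expired_long_ago
```

   Active keys with no expiration have both timestamps unset and are therefore never selected.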
   
   #### Scoping (Phase 2)
   
   While the initial implementation grants API keys the full permissions of the 
associated user, Phase 2 will support scoped access. The `scopes` JSON column 
is included in the schema from day 1 to avoid a migration.
   
   Proposed scope categories:
   
   | Scope | Description |
   |---|---|
   | `dashboards:read` | View dashboards |
   | `dashboards:write` | Create/modify dashboards |
   | `charts:read` | View charts |
   | `charts:write` | Create/modify charts |
   | `datasets:read` | View datasets |
   | `datasets:write` | Create/modify datasets |
   | `sql_lab:read` | Execute SELECT queries |
   | `sql_lab:write` | Execute all SQL (including DDL/DML) |
   
   When scoping is implemented, the security manager will intersect the user's 
RBAC permissions with the key's scopes to determine effective permissions. This 
aligns with the [SIP-131](https://github.com/apache/superset/issues/28377) 
security model revamp.
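   
   The intersection could work roughly as in the sketch below. The mapping from scopes to `(permission, view)` pairs is hypothetical and covers only a few scopes; the real mapping would be defined in Phase 2.

```python
from __future__ import annotations

# Hypothetical mapping from proposed scopes to (permission, view) pairs.
SCOPE_PERMISSIONS: dict[str, set[tuple[str, str]]] = {
    "dashboards:read": {("can_read", "Dashboard")},
    "dashboards:write": {("can_write", "Dashboard")},
    "charts:read": {("can_read", "Chart")},
    "charts:write": {("can_write", "Chart")},
}


def effective_permissions(
    user_permissions: set[tuple[str, str]],
    scopes: list[str] | None,
) -> set[tuple[str, str]]:
    """Intersect the user's RBAC permissions with what the key's scopes allow."""
    if not scopes:  # NULL or empty scopes = full user permissions
        return set(user_permissions)
    allowed: set[tuple[str, str]] = set()
    for scope in scopes:
        allowed |= SCOPE_PERMISSIONS.get(scope, set())
    return set(user_permissions) & allowed
```

   Because the result is an intersection, a scope can never grant a permission the user does not already hold.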
   
   #### Phased Implementation
   
   - **Phase 1**: FAB changes (ApiKey model, validation, `@protect` 
integration) + Superset API key CRUD endpoints + UI + MCP integration. Feature 
flag defaulting to false.
   - **Phase 2**: Scoping support (after SIP-131 foundations). Scopes column is 
already present from Phase 1 — only validation logic needs to be added.
   - **Phase 3**: Admin management of other users' API keys, audit features, 
potential deprecation of FAB short-lived JWT tokens.
   
   The schema is designed so that Phase 2 scoping requires no migration — the 
`scopes` JSON column exists from day 1.
   
   ### New or Changed Public Interfaces
   
   #### REST API
   
   - New `/api/v1/security/api_keys/` endpoint family (implemented in FAB)
   - All existing REST API endpoints accept `Authorization: Bearer sst_...` 
header
   - MCP endpoints accept `Authorization: Bearer sst_...` header
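   
   From a client's point of view, using a key would only require setting the header. A minimal standard-library sketch follows; the host name and key value in the usage note are placeholders, and `build_request` is a hypothetical helper.

```python
import urllib.request


def build_request(base_url: str, api_key: str, path: str) -> urllib.request.Request:
    """Build (but do not send) a GET request authenticated with an API key."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

   For example, `urllib.request.urlopen(build_request("https://superset.example.com", "sst_...", "/api/v1/dashboard/"))` would send the request once the feature is enabled on the server.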
   
   #### UI
   
   New "API Keys" section in user settings page allowing users to:
   
   - View list of their API keys (name, prefix, created date, last used, 
expiration)
   - Create new API keys (with a name and optional expiration and scopes)
   - Revoke existing keys
   
   #### FAB
   
   - New `ApiKey` model
   - `SecurityManager.validate_api_key()` method
   - `@protect()` decorator updated to check Bearer API keys
   - New migration for `api_key` table
   
   #### Configuration
   
   ```python
   # superset_config.py
   FEATURE_FLAGS = {
       "ENABLE_API_KEYS": False,  # Phase 1: feature flag
   }
   
   # Phase 2+: top-level config
   API_KEY_AUTH_ENABLED = True
   API_KEY_HASH_METHOD = "pbkdf2:sha256"
   API_KEY_HASH_SALT_LENGTH = 16
   API_KEY_MAX_PER_USER = 10
   API_KEY_DEFAULT_EXPIRATION_DAYS = None
   API_KEY_REQUIRE_EXPIRATION = False
   API_KEY_CLEANUP_RETENTION_DAYS = 90
   ```
   
   ### New Dependencies
   
   - `werkzeug` (for `generate_password_hash` / `check_password_hash`) — 
already a transitive dependency via Flask
   - No new external dependencies required
   
   ### Migration Plan and Compatibility
   
   #### FAB Changes (PR to Flask-AppBuilder)
   
   1. Add `ApiKey` model
   2. Add `validate_api_key()` to `SecurityManager`
   3. Update `@protect()` to check API key Bearer tokens
   4. Add migration for `api_key` table
   5. Add API endpoints for key CRUD
   
   #### Superset Changes
   
   1. Enable API key feature via config
   2. Add API Keys UI in user settings
   3. Wire MCP auth to use `security_manager.validate_api_key()`
   4. Add cleanup task following `logs` pattern
   
   #### Backward Compatibility
   
   - **FAB JWT tokens**: Continue to work unchanged
   - **Session cookies**: Continue to work unchanged
   - **Existing integrations**: No changes required
   
   API keys are additive and don't affect existing authentication mechanisms. 
Long-term, once API keys are stable, FAB's short-lived JWT tokens could be 
deprecated in favor of API keys for programmatic access.
   
   #### Rollback Plan
   
   1. Set `API_KEY_AUTH_ENABLED = False` (or `FEATURE_FLAGS["ENABLE_API_KEYS"] 
= False`) to disable the feature
   2. Run downgrade migration to remove the `api_key` table
   
   ### Rejected Alternatives
   
   #### 1. Implement at Superset Layer Only
   
   **Why rejected**: Would create split auth management between FAB and 
Superset. FAB's `@protect()` and `@has_access` decorators wouldn't be aware of 
API keys, requiring custom middleware. Implementing in FAB ensures unified auth 
management.
   
   #### 2. Long-lived JWTs via flask-jwt-extended
   
   **Why considered**: Simpler — no new tables, no new auth system. 
Configurable expiration, revocation via blocklist.
   
   **Why not chosen as primary approach**: Doesn't provide the UX that users 
expect (named keys, per-key last_used tracking, prefix-based leak detection). 
However, the FAB implementation can optionally use JWT-style tokens internally 
if that simplifies the auth flow.
   
   #### 3. OAuth2 Client Credentials Flow
   
   **Why rejected**: Requires additional OAuth2 server infrastructure. Overkill 
for simple programmatic access. Could be added later for enterprise deployments.
   
   #### 4. Store Keys Without Hashing
   
   **Why rejected**: Security risk. If the database is compromised, all API 
keys would be exposed.
   
   #### 5. Use UUIDs as Primary Keys
   
   **Why considered**: Project guidelines recommend UUIDs for new models.
   
   **Compromise**: Use integer primary key (for FAB compatibility and 
performance) with an additional `uuid` column for external reference. External 
APIs will use UUIDs, not integer IDs.
   
   ### Security Considerations
   
   1. **Key Storage**: Only hashed keys stored; plaintext shown once at creation
   2. **Key Rotation**: Users can revoke and recreate keys at any time. 
`SECRET_KEY` rotation is supported via `PREVIOUS_SECRET_KEY`.
   3. **Audit Trail**: `created_on`, `last_used_on`, `revoked_on` timestamps 
for tracking
   4. **Rate Limiting**: Subject to existing Superset rate limiting
   5. **Expiration**: Optional expiration dates for time-limited access
   6. **Prefix Identification**: `sst_` prefix allows security tools to 
identify leaked keys
   7. **Automatic Cleanup**: Revoked/expired keys cleaned up after configurable 
retention period (default 90 days), following the `logs` table cleanup pattern
   
   ### References
   
   - [SIP-131: Security Model Revamp](https://github.com/apache/superset/issues/28377)
   - [GitHub Personal Access Tokens](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens)
   - [Stripe API Keys](https://stripe.com/docs/keys)
   - [flask-jwt-extended Blocklist & Token Revoking](https://flask-jwt-extended.readthedocs.io/en/stable/blocklist_and_token_revoking.html)
   - [PR #36173: Initial Implementation](https://github.com/apache/superset/pull/36173)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

