Dev-iL opened a new pull request, #64299:
URL: https://github.com/apache/airflow/pull/64299
closes: #64258
## Summary
- Remove the unmaintained `sshtunnel` dependency (broken on Python 3.14)
from the SSH provider
- Replace with paramiko-native sync tunneling (`SSHTunnel`) and
asyncssh-based async tunneling (`AsyncSSHTunnel`)
- Add `SSHHookAsync.get_tunnel()` as a new async tunnel capability
- Provide backward-compatible deprecation path for existing
`SSHTunnelForwarder` users
## Motivation
The `sshtunnel` package has not been maintained since 2021 and uses syntax
that
causes an import-time `SyntaxError` on Python 3.14. Since `sshtunnel` is
fundamentally a thin wrapper around paramiko's
`Transport.open_channel('direct-tcpip', ...)`,
we can replace it with a direct paramiko implementation that:
1. Fixes Python 3.14 compatibility
2. Removes a dead dependency
3. Reuses `SSHHook.get_conn()` for tunnel connections, inheriting all
auth/proxy configuration
4. Adds async tunnel support via asyncssh (already a dependency)
## Design Decisions
### Reuse `get_conn()` instead of creating a separate SSH connection
**Before:** `SSHTunnelForwarder` established its own SSH connection with
separately
assembled credentials (`ssh_username`, `ssh_password`, `ssh_pkey`,
`ssh_proxy`, etc.),
duplicating the logic in `get_conn()`.
**After:** `SSHTunnel` receives an already-connected `paramiko.SSHClient`
from
`get_conn()`. This means all authentication methods (password, key file,
private key,
ECDSA, proxy commands, host key verification) are automatically inherited
without
duplication.
**Trade-off:** The tunnel now shares the SSH connection with other
operations. This is
acceptable because SSH multiplexes channels over a single transport, and
this is how
`ssh -L` works natively.
### Select-loop in a daemon thread (not thread-per-connection)
The sync `SSHTunnel` uses a single daemon thread running a `select()` loop
that
multiplexes all forwarded connections. This avoids spawning a thread per
connection
(which `sshtunnel` did internally) and keeps resource usage predictable.
A socket-pair (`socketpair()`) is used as a self-pipe to wake the `select()`
loop
cleanly on shutdown, avoiding the need for polling timeouts or signal-based
approaches.
### Minimal backward compatibility shim
Rather than fully reimplementing `SSHTunnelForwarder`'s API surface, we
provide:
- **Context manager** (`__enter__`/`__exit__`) - the recommended interface
- **`.start()`/`.stop()`** - deprecated, emit `DeprecationWarning`
- **`.local_bind_port`** and **`.local_bind_address`** - preserved as
properties
- **`__getattr__`** - raises `AttributeError` with migration hint for
`SSHTunnelForwarder`-specific attributes (e.g., `tunnel_is_up`, `ssh_host`)
This covers the known usage patterns without maintaining dead code. No
external
providers or common user code accesses `SSHTunnelForwarder`-specific
attributes
beyond context manager + `local_bind_port`.
### AsyncSSHTunnel as a thin asyncssh wrapper
`asyncssh` already provides `forward_local_port()` which handles all the
forwarding
internally. `AsyncSSHTunnel` is a thin wrapper that:
- Manages the lifecycle (listener + SSH connection cleanup in `__aexit__`)
- Exposes `.local_bind_port` via `listener.get_port()`
- Handles cleanup on `__aenter__` failure (closes SSH connection if
`forward_local_port` raises)
- Follows the `async with await hook.get_tunnel(...)` pattern
### Eager socket binding in constructor
`SSHTunnel.__init__` binds the local socket immediately (before
`__enter__`), so
`.local_bind_port` is available right after construction. This matches
`SSHTunnelForwarder`'s behavior where the port was known before calling
`.start()`.
The socket is cleaned up properly even if construction fails (try/except on
bind/listen).
## Changes
### New files
- **`providers/ssh/src/airflow/providers/ssh/tunnel.py`** - `SSHTunnel`
(sync,
paramiko) and `AsyncSSHTunnel` (async, asyncssh) classes with full
docstrings
and migration guidance
### Modified files
- **`providers/ssh/src/airflow/providers/ssh/hooks/ssh.py`**
- `SSHHook.get_tunnel()` now returns `SSHTunnel` via `get_conn()`
- Removed `from sshtunnel import SSHTunnelForwarder`
- Removed paramiko logger workaround (sshtunnel-specific hack)
- Added `SSHHookAsync.get_tunnel()` returning `AsyncSSHTunnel`
- **`providers/ssh/pyproject.toml`** - Removed `sshtunnel>=0.3.2` dependency
- **`providers/ssh/tests/unit/ssh/hooks/test_ssh.py`**
- Rewrote 5 tunnel unit tests to mock `paramiko.SSHClient` + `SSHTunnel`
instead
of `SSHTunnelForwarder`
- Added 2 new hook-level tests (local_port, remote_host parameters)
- Added `TestSSHTunnel` class with 7 tests: deprecation warnings,
`__getattr__`
migration hints, ephemeral/explicit port binding, context manager
lifecycle
- **`providers/ssh/tests/unit/ssh/hooks/test_ssh_async.py`**
- Added 3 async tunnel tests: return type, async context manager with
cleanup
verification, explicit local_port
- **`docker-tests/tests/docker_tests/test_prod_image.py`** - Removed
`sshtunnel`
from expected package imports
- **`devel-common/src/docs/utils/conf_constants.py`** - Removed `sshtunnel`
from
third-party autodoc allowlist
- **`docs/spelling_wordlist.txt`** - Replaced
`sshtunnel`/`SSHTunnelForwarder` with
`SSHTunnel`
## Bugs found and fixed during review
1. **AsyncSSHTunnel resource leak on `__aenter__` failure** (Medium): If
`forward_local_port()` raised an exception, the SSH connection passed to
`AsyncSSHTunnel` would never be closed since `__aexit__` is not called
when
`__aenter__` fails. Fixed by wrapping `forward_local_port()` in try/except
that closes the SSH connection before re-raising.
2. **SSHTunnel socket leak on bind failure** (Low): If `bind()` or `listen()`
raised `OSError` in the constructor, the socket created moments before
would
leak. Fixed by wrapping bind/listen in try/except that closes the socket
before re-raising.
## Test results
- 105 unit tests pass (all non-integration SSH provider tests)
- 42 integration tests deselected (require running local SSH server)
- All ruff format and ruff check pass
<!-- SPDX-License-Identifier: Apache-2.0
https://www.apache.org/licenses/LICENSE-2.0 -->
<!--
Thank you for contributing!
Please provide above a brief description of the changes made in this pull
request.
Write a good git commit message following this guide:
http://chris.beams.io/posts/git-commit/
Please make sure that your code changes are covered with tests.
And in case of new features or big changes remember to adjust the
documentation.
Feel free to ping (in general) for the review if you do not see reaction for
a few days
(72 Hours is the minimum reaction time you can expect from volunteers) - we
sometimes miss notifications.
In case of an existing issue, reference it using one of the following:
* closes: #ISSUE
* related: #ISSUE
-->
---
##### Was generative AI tooling used to co-author this PR?
<!--
If generative AI tooling has been used in the process of authoring this PR,
please
change below checkbox to `[X]` followed by the name of the tool, uncomment
the "Generated-by".
-->
- [x] Yes (please specify the tool below)
Generated-by: Claude Opus 4.6 following [the
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
---
* Read the **[Pull Request
Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)**
for more information. Note: commit author/co-author name and email in commits
become permanently public when merged.
* For fundamental code changes, an Airflow Improvement Proposal
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
is needed.
* When adding dependency, check compliance with the [ASF 3rd Party License
Policy](https://www.apache.org/legal/resolved.html#category-x).
* For significant user-facing changes create newsfragment:
`{pr_number}.significant.rst`, in
[airflow-core/newsfragments](https://github.com/apache/airflow/tree/main/airflow-core/newsfragments).
You can add this file in a follow-up commit after the PR is created so you
know the PR number.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]