Dev-iL opened a new pull request, #64299:
URL: https://github.com/apache/airflow/pull/64299

   closes: #64258 
   
   ## Summary
   
   - Remove the unmaintained `sshtunnel` dependency (broken on Python 3.14) 
from the SSH provider
   - Replace with paramiko-native sync tunneling (`SSHTunnel`) and 
asyncssh-based async tunneling (`AsyncSSHTunnel`)
   - Add `SSHHookAsync.get_tunnel()` as a new async tunnel capability
   - Provide backward-compatible deprecation path for existing 
`SSHTunnelForwarder` users
   
   ## Motivation
   
   The `sshtunnel` package has not been maintained since 2021 and uses syntax 
that
   causes an import-time `SyntaxError` on Python 3.14. Since `sshtunnel` is
   fundamentally a thin wrapper around paramiko's 
`Transport.open_channel('direct-tcpip', ...)`,
   we can replace it with a direct paramiko implementation that:
   
   1. Fixes Python 3.14 compatibility
   2. Removes a dead dependency
   3. Reuses `SSHHook.get_conn()` for tunnel connections, inheriting all 
auth/proxy configuration
   4. Adds async tunnel support via asyncssh (already a dependency)
   
   ## Design Decisions
   
   ### Reuse `get_conn()` instead of creating a separate SSH connection
   
   **Before:** `SSHTunnelForwarder` established its own SSH connection with 
separately
   assembled credentials (`ssh_username`, `ssh_password`, `ssh_pkey`, 
`ssh_proxy`, etc.),
   duplicating the logic in `get_conn()`.
   
   **After:** `SSHTunnel` receives an already-connected `paramiko.SSHClient` 
from
   `get_conn()`. This means all authentication methods (password, key file, 
private key,
   ECDSA, proxy commands, host key verification) are automatically inherited 
without
   duplication.
   
   **Trade-off:** The tunnel now shares the SSH connection with other 
operations. This is
   acceptable because SSH multiplexes channels over a single transport, and 
this is how
   `ssh -L` works natively.
   
   ### Select-loop in a daemon thread (not thread-per-connection)
   
   The sync `SSHTunnel` uses a single daemon thread running a `select()` loop 
that
   multiplexes all forwarded connections. This avoids spawning a thread per 
connection
   (which `sshtunnel` did internally) and keeps resource usage predictable.
   
   A socket-pair (`socketpair()`) is used as a self-pipe to wake the `select()` 
loop
   cleanly on shutdown, avoiding the need for polling timeouts or signal-based
   approaches.
   
   ### Minimal backward compatibility shim
   
   Rather than fully reimplementing `SSHTunnelForwarder`'s API surface, we 
provide:
   
   - **Context manager** (`__enter__`/`__exit__`) - the recommended interface
   - **`.start()`/`.stop()`** - deprecated, emit `DeprecationWarning`
   - **`.local_bind_port`** and **`.local_bind_address`** - preserved as 
properties
   - **`__getattr__`** - raises `AttributeError` with migration hint for
     `SSHTunnelForwarder`-specific attributes (e.g., `tunnel_is_up`, `ssh_host`)
   
   This covers the known usage patterns without maintaining dead code. No 
external
   providers or common user code accesses `SSHTunnelForwarder`-specific 
attributes
   beyond context manager + `local_bind_port`.
   
   ### AsyncSSHTunnel as a thin asyncssh wrapper
   
   `asyncssh` already provides `forward_local_port()` which handles all the 
forwarding
   internally. `AsyncSSHTunnel` is a thin wrapper that:
   
   - Manages the lifecycle (listener + SSH connection cleanup in `__aexit__`)
   - Exposes `.local_bind_port` via `listener.get_port()`
   - Handles cleanup on `__aenter__` failure (closes SSH connection if 
`forward_local_port` raises)
   - Follows the `async with await hook.get_tunnel(...)` pattern
   
   ### Eager socket binding in constructor
   
   `SSHTunnel.__init__` binds the local socket immediately (before 
`__enter__`), so
   `.local_bind_port` is available right after construction. This matches
   `SSHTunnelForwarder`'s behavior where the port was known before calling 
`.start()`.
   The socket is cleaned up properly even if construction fails (try/except on
   bind/listen).
   
   ## Changes
   
   ### New files
   
   - **`providers/ssh/src/airflow/providers/ssh/tunnel.py`** - `SSHTunnel` 
(sync,
     paramiko) and `AsyncSSHTunnel` (async, asyncssh) classes with full 
docstrings
     and migration guidance
   
   ### Modified files
   
   - **`providers/ssh/src/airflow/providers/ssh/hooks/ssh.py`**
     - `SSHHook.get_tunnel()` now returns `SSHTunnel` via `get_conn()`
     - Removed `from sshtunnel import SSHTunnelForwarder`
     - Removed paramiko logger workaround (sshtunnel-specific hack)
     - Added `SSHHookAsync.get_tunnel()` returning `AsyncSSHTunnel`
   
   - **`providers/ssh/pyproject.toml`** - Removed `sshtunnel>=0.3.2` dependency
   
   - **`providers/ssh/tests/unit/ssh/hooks/test_ssh.py`**
     - Rewrote 5 tunnel unit tests to mock `paramiko.SSHClient` + `SSHTunnel` 
instead
       of `SSHTunnelForwarder`
     - Added 2 new hook-level tests (local_port, remote_host parameters)
     - Added `TestSSHTunnel` class with 7 tests: deprecation warnings, 
`__getattr__`
       migration hints, ephemeral/explicit port binding, context manager 
lifecycle
   
   - **`providers/ssh/tests/unit/ssh/hooks/test_ssh_async.py`**
     - Added 3 async tunnel tests: return type, async context manager with 
cleanup
       verification, explicit local_port
   
   - **`docker-tests/tests/docker_tests/test_prod_image.py`** - Removed 
`sshtunnel`
     from expected package imports
   
   - **`devel-common/src/docs/utils/conf_constants.py`** - Removed `sshtunnel` 
from
     third-party autodoc allowlist
   
   - **`docs/spelling_wordlist.txt`** - Replaced 
`sshtunnel`/`SSHTunnelForwarder` with
     `SSHTunnel`
   
   ## Bugs found and fixed during review
   
   1. **AsyncSSHTunnel resource leak on `__aenter__` failure** (Medium): If
      `forward_local_port()` raised an exception, the SSH connection passed to
      `AsyncSSHTunnel` would never be closed since `__aexit__` is not called 
when
      `__aenter__` fails. Fixed by wrapping `forward_local_port()` in try/except
      that closes the SSH connection before re-raising.
   
   2. **SSHTunnel socket leak on bind failure** (Low): If `bind()` or `listen()`
      raised `OSError` in the constructor, the socket created moments before 
would
      leak. Fixed by wrapping bind/listen in try/except that closes the socket
      before re-raising.
   
   ## Test results
   
   - 105 unit tests pass (all non-integration SSH provider tests)
   - 42 integration tests deselected (require running local SSH server)
   - All ruff format and ruff check pass
   
   
    <!-- SPDX-License-Identifier: Apache-2.0
         https://www.apache.org/licenses/LICENSE-2.0 -->
   
   <!--
   Thank you for contributing!
   
   Please provide above a brief description of the changes made in this pull 
request.
   Write a good git commit message following this guide: 
http://chris.beams.io/posts/git-commit/
   
   Please make sure that your code changes are covered with tests.
   And in case of new features or big changes remember to adjust the 
documentation.
   
   Feel free to ping (in general) for the review if you do not see reaction for 
a few days
   (72 Hours is the minimum reaction time you can expect from volunteers) - we 
sometimes miss notifications.
   
   In case of an existing issue, reference it using one of the following:
   
   * closes: #ISSUE
   * related: #ISSUE
   -->
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   <!--
   If generative AI tooling has been used in the process of authoring this PR, 
please
   change below checkbox to `[X]` followed by the name of the tool, uncomment 
the "Generated-by".
   -->
   
   - [x] Yes (please specify the tool below)
   
   Generated-by: Claude Opus 4.6 following [the 
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
   
   ---
   
   * Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)**
 for more information. Note: commit author/co-author name and email in commits 
become permanently public when merged.
   * For fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   * When adding dependency, check compliance with the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   * For significant user-facing changes create newsfragment: 
`{pr_number}.significant.rst`, in 
[airflow-core/newsfragments](https://github.com/apache/airflow/tree/main/airflow-core/newsfragments).
 You can add this file in a follow-up commit after the PR is created so you 
know the PR number.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to