adriangb opened a new issue, #705: URL: https://github.com/apache/arrow-rs-object-store/issues/705
## Motivation `object_store` already has the seams for HTTP wrapping — `HttpService`, `HttpConnector`, `HttpBuilder::with_http_connector` — but every consumer who wants distributed tracing, header injection, custom retry/timeout policy, or a non-`reqwest` transport ends up writing the same boilerplate. Concrete pain points pulled from one downstream (a DataFusion-backed query engine running against both cloud durable storage and an HTTP-based local cache): 1. **No support for `reqwest_middleware::ClientWithMiddleware`.** `HttpService` is implemented for `reqwest::Client` but not its middleware-aware sibling. Anyone who wants to plug in `reqwest-tracing` (or any other `reqwest_middleware::Middleware`) has to write a 30-line adapter that mirrors the existing reqwest impl. 2. **`HttpService` is bespoke, not `tower::Service`.** Cannot directly use `tower_http::trace::TraceLayer`, `tower::timeout`, `tower::retry`, OpenTelemetry tower layers, or anything else from the tower ecosystem. 3. **Operation context is gone by the time `HttpService::call` runs.** A wrapper sees `GET https://.../{bucket}/{key}` with a `Range:` header. It doesn't know whether the originating `ObjectStore` call was `get_opts` vs `head` vs `put`, the user's `Path`, or which backend (S3 vs HTTP) issued the request. Distinguishing reads from head probes for span attributes has to be done by sniffing method + headers + URL shape — brittle and backend-specific. 4. **No built-in W3C trace-context injection.** Every consumer doing distributed tracing reimplements `opentelemetry_http::HeaderInjector` plumbing. (This becomes one existing middleware/layer once any of (1)/(2) lands.) ## Proposal at a glance Three independent additions, each behind its own opt-in feature, no breaking change to existing API: - **#702** — `impl HttpService for reqwest_middleware::ClientWithMiddleware` (high-level shortcut, ~30 lines mirroring the existing reqwest impl). Lands the most common use case (`reqwest_tracing::TracingMiddleware`) immediately. - **#704** — `TowerHttpConnector` (general primitive). Adapts any `tower::Service<http::Request<HttpRequestBody>, Response = http::Response<HttpResponseBody>>` into an `HttpConnector`, unlocking `tower-http`, `tower-otel-*`, and non-reqwest transports (`hyper`, `ureq`, `tower::service_fn` mocks, wasm-friendly clients). - **#703** — `ObjectStoreOperation` extension on outbound requests. Each backend inserts an `ObjectStoreOperation { kind, location, backend }` into `http::Request::extensions` so a wrapping `HttpService` (tower layer or reqwest middleware) can produce useful trace spans without sniffing URLs and headers. `http::Extensions` is in-process only, so this is invisible on the wire. ## These PRs are independent The three PRs have no logical dependency on each other and can be reviewed and merged in any order. PR #703 makes the wrappers from #702 and #704 *more useful* (operation context becomes visible) but is not a prerequisite. The only conflicts between the three are trivial textual merges: - All three add a `mod` line to `src/client/http/mod.rs`. - #702 and #704 each add an optional dep + feature stanza to `Cargo.toml`. ## Scope notes - **#702 pinned to `reqwest-middleware = \"0.4\"`** because object_store's `reqwest = \"0.12\"` is incompatible with `reqwest-middleware = \"0.5\"` (which requires `reqwest 0.13`). Bump together when object_store moves to reqwest 0.13. - **#703 instruments only the HTTP (WebDAV) backend** in this PR, to keep review surface tractable. AWS, GCP, and Azure each have their own request-construction style and are best landed as separate follow-up PRs (one per backend) once the mechanism is in place. The `ObjectStoreOperation` types and `HttpRequestBuilder::operation` helper are landed in #703 and ready for those follow-ups to use directly. - **#702 is non-wasm only.** `reqwest-middleware`'s `Middleware` trait requires `Send + Sync` while the wasm `HttpService` plumbing uses `?Send`, and the channel-based wasm body logic in the existing reqwest impl isn't worth duplicating just for this convenience layer. - **Tower error mapping is intentionally simple** in #704: `poll_ready` failures → `Connect`, `call` failures → `Request`. Can iterate later to mirror `HttpError::reqwest`'s richer classification if needed. ## Out of scope for this initiative - Built-in OpenTelemetry exporter integration. Once any of #702/#704 lands, every existing OTel-aware middleware/layer composes; the crate doesn't need a direct `opentelemetry` dependency. - Replacing `RetryConfig`. Retries already happen at a higher layer in `object_store` and that should remain so. - Flipping the `tower` and `reqwest-middleware` features to default-on. That's a follow-up after the three PRs soak. ## Status All three drafts are CI-green at the time of this filing. 🤖 Generated with [Claude Code](https://claude.com/claude-code) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
