On Tue, Sep 10, 2024 at 6:26 PM Jim Newsome <jnews...@torproject.org> wrote:
>
> It'd be helpful to have more context about the object IDs and what we're
> trying to accomplish with them here; why we need/want them in arti but
> didn't in c-tor. I'm inferring (maybe incorrectly) that the idea is that
> this is effectively letting us multiplex differently-configured
> SOCKS->Tor services on a single port. And/or maybe to multiplex multiple
> data connections over a single SOCKS socket? Is it worth doing these vs
> the alternatives (a listening port per service/object and a socket per
> data stream)? e.g. is this fixing some current resource exhaustion
> issue, or one we expect to be more problematic in arti...?
>
> Maybe worth mentioning the length limit for user and password (255 I
> believe) and that it'll be sufficient (?)
>
> Otherwise LGTM

This is a good question!

Right now there isn't a complete spec for arti RPC, but for background
you could have a look at the file `rpc-meta-draft.md`
( https://gitlab.torproject.org/tpo/core/arti/-/blob/main/doc/dev/notes/rpc-meta-draft.md )
in arti, as amended by the WIP branch at
https://gitlab.torproject.org/tpo/core/arti/-/merge_requests/2386 .

This doesn't (yet) describe the DataStream protocol, since that's what
we're trying to hammer out here. There's a comment in arti::socks about
that, which I hope to migrate to rpc-meta-draft once it is accurate.
I'll copy out the relevant parts below, since many of them are about to
be overwritten by this proposal.

Sorry about so many incomplete documents! I hope that this will help
answer the questions. If not, please just poke me again.

/// ## Key concepts
///
/// A data stream is "RPC-visible" if, when it is created via SOCKS,
/// the RPC system is told about it.
///
/// Every RPC-visible stream is associated with a given RPC object when it is created.
/// (Since the RPC object is being specified in the SOCKS protocol,
/// it must be one with an externally visible Object ID.
/// Such Object IDs are cryptographically unguessable and unforgeable,
/// and are qualified with a unique identifier for their associated RPC session.)
/// Call this RPC Object the "target" object for now.
/// This target RPC object must implement
/// the [`ConnectWithPrefs`](arti_client::rpc::ConnectWithPrefs) special method.
///
/// Right now, there are two general kinds of objects that implement this method:
/// client-like objects, and stream-like objects.
///
/// A client-like object is either a `TorClient` or an RPC `Session`.
/// It knows about, and is capable of opening, multiple data streams.
/// Using it as the target object for a SOCKS connection tells Arti
/// that the resulting data stream (if any)
/// should be built by it, and associated with its RPC session.
///
/// An application gets a TorClient by asking the session for one,
/// or by asking a TorClient for a new variant clone of itself.
///
/// A stream-like object is an `arti_rpcserver::stream::RpcDataStream`.
/// It is created from a client-like object, but represents a single data stream.
/// When created, it is not yet connected or trying to connect to anywhere:
/// the act of using it as the target Object for a SOCKS connection causes
/// it to begin connecting.
/// (You can also think of this as a single-use client,
/// which once used, becomes interchangeable with the DataStream it created.)
/// (TODO: We may wish to change this vocabulary.
/// We may wish to call this a "stream handle", for instance?)
///
/// An application gets an RpcDataStream by calling `arti:new_stream_handle`
/// on any client-like object. Currently, this always creates an RpcDataStream
/// that makes optimistic connections; see #1583.

...

/// ## Intended use cases (examples)
///
/// (These examples assume that the application
/// already knows the SOCKS port it should use.
/// I'm leaving out the isolation strings as orthogonal.)
///
/// These are **NOT** the only possible use cases;
/// they're just the two that help understand this system best (I hope).
///
/// ### Case 1: Using a client-like object directly.
///
/// Here the application has authenticated to RPC
/// and gotten the session ID `SESSION-1`.
/// (In reality, this would be a longer ID, and full of crypto).
///
/// The application wants to open a new stream to www.example.com.
/// They don't particularly care about isolation,
/// but they do want their stream to use their RPC session.
/// They don't want an Object ID for the stream.
///
/// To do this, they make a SOCKS connection to arti,
/// with target address www.example.com.
/// They set the username to `<arti-rpc-session>`,
/// and the password to `SESSION-1`.
///
/// Arti looks up the Session object via the `SESSION-1` object ID
/// and tells it (via the ConnectWithPrefs special method)
/// to connect to www.example.com.
/// The session creates a new DataStream using its internal TorClient,
/// but does not register the stream with an RPC Object ID.
/// Arti proxies the application's SOCKS connection through this DataStream.
///
///
/// ### Case 2: Creating an identifiable stream.
///
/// Here the application wants to be able to refer to its DataStream
/// after the stream is created.
/// As before, we assume that it's on an RPC session
/// where the Session ID is `SESSION-1`.
///
/// The application sends an RPC request of the form:
/// `{"id": 123, "obj": "SESSION-1", "method": "arti:new_stream_handle", "params": {}}`
///
/// It receives a reply like:
/// `{"id": 123, "result": {"id": "STREAM-1"} }`
///
/// (In reality, `STREAM-1` would also be longer and full of crypto.)
///
/// Now the application has an object called `STREAM-1` that is not yet a connected
/// stream, but which may become one.
///
/// The application opens a SOCKS connection as before.
/// For the username it sends `<arti-rpc-session>`,
/// and for the password it sends `STREAM-1`.
///
/// Now Arti looks up the `RpcDataStream` object via `STREAM-1`,
/// and tells it (via the ConnectWithPrefs special method)
/// to connect to www.example.com.
/// This causes the `RpcDataStream` internally to create a new `DataStream`,
/// and to store that `DataStream` in itself.
/// The `RpcDataStream` with Object ID `STREAM-1`
/// is now an alias for the newly created `DataStream`.
/// Arti proxies the application's SOCKS connection through that `DataStream`.
///

_______________________________________________
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
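
To make the SOCKS step in Case 1 and Case 2 concrete, and to illustrate the 255-byte limit raised in the quoted question, here is a rough sketch of how a client might encode the username/password pair. It assumes the standard SOCKS5 username/password sub-negotiation layout from RFC 1929 (version byte 0x01, then a one-byte length and the bytes for each field). The function and error names are hypothetical and are not part of arti; the credential strings are the placeholders from the examples, not real Object IDs.

```rust
/// Hypothetical error type for this sketch: a credential that cannot be
/// described by the single length byte that RFC 1929 allows.
#[derive(Debug, PartialEq, Eq)]
pub enum AuthError {
    /// Each field must be between 1 and 255 bytes long.
    BadLength { field: &'static str, len: usize },
}

/// Append one length-prefixed field (ULEN+UNAME or PLEN+PASSWD) to `msg`.
fn push_field(msg: &mut Vec<u8>, field: &'static str, value: &str) -> Result<(), AuthError> {
    let bytes = value.as_bytes();
    if bytes.is_empty() || bytes.len() > 255 {
        return Err(AuthError::BadLength { field, len: bytes.len() });
    }
    // The check above guarantees the length fits in one byte.
    msg.push(bytes.len() as u8);
    msg.extend_from_slice(bytes);
    Ok(())
}

/// Build an RFC 1929 sub-negotiation request:
/// VER (0x01) | ULEN | UNAME | PLEN | PASSWD.
///
/// (Hypothetical helper, not an arti API.) In the examples above, `username`
/// would be `<arti-rpc-session>` and `password` the Object ID of the target
/// object, such as `SESSION-1` or `STREAM-1`.
pub fn socks5_auth_request(username: &str, password: &str) -> Result<Vec<u8>, AuthError> {
    let mut msg = vec![0x01];
    push_field(&mut msg, "username", username)?;
    push_field(&mut msg, "password", password)?;
    Ok(msg)
}
```

With the placeholder values from Case 1, `socks5_auth_request("<arti-rpc-session>", "SESSION-1")` produces a 30-byte message; an Object ID longer than 255 bytes is rejected, so real Object IDs would need to stay under that limit.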