Re: Fwd: Adding IPFS Trustless Gateway Protocol Questions
> my understanding was that there were two
> main ways to get files over IPFS: one is to get them via HTTP from an IPFS
> gateway that knows about IPFS (what curl does now) and the other is to become
> a full-fledged node in the IPFS network and speak the IPFS protocols to the
> world.

We have multiple HTTP gateway specifications; curl currently uses the IPFS path gateway spec. With this spec, curl receives the file in its non-IPFS form:

Hash Validated
┌──────────────┐
│Host → Gateway│→ Curl
└──────────────┘

What I'll submit will use the IPFS trustless gateway spec, so curl will receive the IPFS data structure, hash it, validate it and deserialize it:

Hash Validated
┌─────────────────────┐
│Host → Gateway → Curl│
└─────────────────────┘

> What you describe sounds like a third method, where one may somehow
> find a full IPFS node that happens to have your file and talk a subset of the
> IPFS protocol to get that file. Is that an accurate assessment?

Not exactly. In what I'll propose the server is still acting as a gateway, so if it doesn't have the file cached it will download it from someone else. I agree it's not a perfect solution, because most public servers do not proxy requests (only public gateways do) and running a private gateway requires deploying a new service. There is also the option of knowing out of band which server hosts the file you want, but that is less convenient.

> If so, is that really a mode that would be used by a significant number of
> people?

I'm not gonna deny that there is a bubble effect going on but it's pretty popular within my own bubble.

Examples of the trustless gateway used in production between IPFS nodes are:
- https://www.pinata.cloud/ (SaaS IPFS)
- https://saturn.tech/ (P2P CDN)
- https://web3.storage/ (SaaS IPFS)
- (more that I don't know about off the top of my head)

Other IPFS implementations also support it:
- https://github.com/filecoin-project/lassie (go cli, client)
- https://github.com/filecoin-project/boost (go daemon, server)
- https://github.com/ipfs/kubo (go daemon, server, client not yet)
- https://github.com/ipfs/helia (js browser and nodejs, client)
- https://github.com/little-bear-labs/ipfs-chromium (cpp blink runtime, client)

It is used for two big reasons:
1. IP transit costs are lower over HTTP than for the same bytes over QUIC or TCP, or so I have been told (even if you compare H3 vs QUIC). I don't know how much of that is because CDNs have had decades to optimize their HTTP implementations and how much is because pricing models for CDNs are weird.
2. HTTP is also widely implemented everywhere, making developers' lives easier when implementing IPFS features.

> How do you find an appropriate node for each file, for example?

The implementations I listed above query IPNI (federated indexers which map hashes to servers) and/or the DHT (a distributed index which maps hashes to servers); our DHT is based on Kademlia.
Having content routing in curl would allow skipping the gateway and talking to content hosts directly. I think it makes more sense to figure out data validation before trying to download content from strangers on the internet, if that is even a thing curl wants to do.

--
Unsubscribe: https://lists.haxx.se/mailman/listinfo/curl-library
Etiquette: https://curl.se/mail/etiquette.html
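The "hash it, validate it" step described in this message can be sketched for the simplest case, a single raw block. This is only an illustration, not the code proposed for curl: it assumes a CIDv1 string in base32 with the raw codec and a sha2-256 multihash, and the helper names `cid_for_raw_block` and `verify_raw_block` are hypothetical. Real content is usually a tree of blocks, which would need the same check at every node.

```python
import base64
import hashlib

SHA2_256 = 0x12   # multihash code for sha2-256
RAW_CODEC = 0x55  # multicodec code for raw bytes


def cid_for_raw_block(block: bytes) -> str:
    """Build a base32 CIDv1 (raw codec, sha2-256) for a block.

    Hypothetical helper, only here so the sketch has a CID to check.
    """
    digest = hashlib.sha256(block).digest()
    # Layout: version, codec, hash function code, digest length, digest.
    cid_bytes = bytes([0x01, RAW_CODEC, SHA2_256, len(digest)]) + digest
    text = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + text  # "b" is the multibase prefix for lowercase base32


def verify_raw_block(cid: str, block: bytes) -> bool:
    """Return True if the block hashes to the digest embedded in the CID."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles base32 CIDv1 strings")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding base32 expects
    raw = base64.b32decode(body)
    if len(raw) < 4:
        raise ValueError("CID too short")
    version, codec, hash_code, length = raw[0], raw[1], raw[2], raw[3]
    digest = raw[4:]
    if (version, codec, hash_code) != (0x01, RAW_CODEC, SHA2_256):
        raise ValueError("this sketch only handles raw sha2-256 CIDv1")
    if length != len(digest):
        raise ValueError("digest length mismatch")
    # The client trusts the CID, not the server: recompute and compare.
    return hashlib.sha256(block).digest() == digest
```

With a check like this on the client, a gateway that returns wrong bytes is detected regardless of who operates it, which is what moves curl inside the "Hash Validated" box.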
Re: Fwd: Adding IPFS Trustless Gateway Protocol Questions
On Thu, Oct 26, 2023 at 03:42:34PM +0200, Hugo Valtier via curl-library wrote:
> Instead of using the Path Gateway it uses the Trustless Gateway which
> answers with a stream of blocks and walks the merkle-tree, verifies
> hashes and deserializes it on the fly.
> This would make curl or libcurl capable of downloading ipfs:// content
> from any reachable IPFS node, not just a localhost trusted one.

I'm far from an expert in IPFS, but my understanding was that there were two main ways to get files over IPFS: one is to get them via HTTP from an IPFS gateway that knows about IPFS (what curl does now) and the other is to become a full-fledged node in the IPFS network and speak the IPFS protocols to the world.

What you describe sounds like a third method, where one may somehow find a full IPFS node that happens to have your file and talk a subset of the IPFS protocol to get that file. Is that an accurate assessment? If so, is that really a mode that would be used by a significant number of people? How do you find an appropriate node for each file, for example?
Re: Fwd: Adding IPFS Trustless Gateway Protocol Questions
> I know very little about IPFS but it certainly sounds like the right way.

Good, I'll change my code to be an improvement of the existing ipfs:// feature in the CLI and I think we can continue the discussion when I have a pull request. Thx for your time.

> Would this not rather *replace* the URL rewrite approach? I mean, isn't
> verifying the content always better than not verifying? And if not, how
> would a user know when to use which?

Yeah, sorry, my explanation was confusing. Currently, with the ipfs:// feature in curl, the IPFS decoding and hash validation happen server side; instead I'll submit some code which does that in curl. So a buggy or badly behaving server would be caught by curl, which would raise an error.

> One of my most important tasks in this project is to resist creeping featurism
> - and everyone who has been around here a while knows that we are still adding
> features and changes at a rather high pace. To make sure that the line in the
> sand that is drawn between what is suitable for libcurl and what is NOT
> suitable for libcurl remains intact and untouched as much as possible. I think
> sticking to "core" transport protocols could be considered one of those things
> that so far has defined if a protocol is meant for libcurl or not.
>
> Lines in sand are by definition not distinct or always very clear, but I think
> we do everyone a service when we resist changing how that line is drawn.
>
> Of course, sometimes we end up clarifying or changing how we view the world
> after discussions and deliberating and then we adapt and move on with an
> updated view. But until that moment, we resist it.

Fair enough.

> I'm sorry but I don't understand the question. If IPFS-over-HTTP is not for
> libcurl, then how is Trustless-Gateway-over-HTTP-over-Libp2p any better?

I used it to poke holes in:

> If you want an IPFS-over-HTTP library you can make one on top of
> libcurl.

but as you pointed out:

> Lines in sand are by definition not distinct or always very clear, but I think
> we do everyone a service when we resist changing how that line is drawn.

so this is irrelevant. I think the best path forward is for me to send a PR iterating on what is already in the CLI.

> Are you proposing that libcurl would use another library for IPFS that itself
> would have a *separate* HTTP implementation and this, because you're doing
> HTTP in a non-standard way?

I was saying that in case the suggestion was that, by doing IPFS-over-HTTP, curl wouldn't speak the real IPFS protocol. This is still experimental. I wasn't planning on this in libcurl any time soon. A hypothetical implementation would use curl's HTTP code, but running over custom streams with TCP semantics (which are multiplexed and encrypted). This protocol's main goal is to allow us to use existing HTTP clients over our mutually encrypted P2P network, which is really useful in browsers.
Re: Fwd: Adding IPFS Trustless Gateway Protocol Questions
On Thu, 26 Oct 2023, Hugo Valtier via curl-library wrote:

> This would make curl or libcurl capable of downloading ipfs:// content from
> any reachable IPFS node, not just a localhost trusted one. If anything, do we
> agree that doing this is a desirable improvement, assuming it is implemented
> in curl next to the current ipfs:// URL rewriting?

I know very little about IPFS but it certainly sounds like the right way.

Would this not rather *replace* the URL rewrite approach? I mean, isn't verifying the content always better than not verifying? And if not, how would a user know when to use which?

> For me the main draw of having it in libcurl is that we can provide the same
> GET semantics to consumers while having self validating incrementally
> verified requests.

Right, for application authors who want to add IPFS support next to some other protocols that libcurl already supports that seems like a decent benefit. I just don't see the user demand for that while at the same time it would add a significant cost and complexity to the library - plus the fact that "hierarchically" it is not a clear-cut fit for libcurl.

> If I make my own C library which consumes libcurl either I have a different
> API (creates work for consumers) or I make my own fork but I believe me
> tracking you is more work overall than if libcurl would know how to reach out
> to the IPFS validation code, it's also annoying for distributions.

To provide IPFS-over-HTTP as a library, I would probably argue that doing this as a separate library that itself uses libcurl is the way forward. The API could even be designed to work as a libcurl companion perhaps rather than a straight layer on top.

> I don't understand why the line is drawn precisely at over-HTTP; the
> Trustless Gateway is used in production to transfer data between IPFS
> implementations.

One of my most important tasks in this project is to resist creeping featurism - and everyone who has been around here a while knows that we are still adding features and changes at a rather high pace. To make sure that the line in the sand that is drawn between what is suitable for libcurl and what is NOT suitable for libcurl remains intact and untouched as much as possible. I think sticking to "core" transport protocols could be considered one of those things that so far has defined if a protocol is meant for libcurl or not.

Lines in sand are by definition not distinct or always very clear, but I think we do everyone a service when we resist changing how that line is drawn.

Of course, sometimes we end up clarifying or changing how we view the world after discussions and deliberating and then we adapt and move on with an updated view. But until that moment, we resist it.

> Would the new Trustless-Gateway-over-HTTP-over-Libp2p protocol be a suitable
> candidate for libcurl?

I'm sorry but I don't understand the question. If IPFS-over-HTTP is not for libcurl, then how is Trustless-Gateway-over-HTTP-over-Libp2p any better? And why do HTTP-over-libp2p at all?

> AFAICT it can't just be done on top since it requires running HTTP over
> yamux + tls + tcp (or other libp2p transports).

Are you proposing that libcurl would use another library for IPFS that itself would have a *separate* HTTP implementation and this, because you're doing HTTP in a non-standard way? That sounds so crazy I must have misunderstood!

--
 / daniel.haxx.se
 | Commercial curl support up to 24x7 is available!
 | Private help, bug fixes, support, ports, new features
 | https://curl.se/support.html
Fwd: Adding IPFS Trustless Gateway Protocol Questions
> If we are still talking about IPFS-over-HTTP then I believe this situation
> stands. If you want an IPFS-over-HTTP library you can make one on top of
> libcurl.
>
> Am I wrong and if so, how and why?

Right now it is indeed "IPFS-over-HTTP"; however, it is different from the current implementation. Instead of using the Path Gateway it uses the Trustless Gateway, which answers with a stream of blocks; the client then walks the merkle tree, verifies hashes and deserializes the data on the fly.
This would make curl or libcurl capable of downloading ipfs:// content from any reachable IPFS node, not just a localhost trusted one.

If anything, do we agree that doing this is a desirable improvement, assuming it is implemented in curl next to the current ipfs:// URL rewriting?

> A primary reason I insisted on making the current implementation for the curl
> tool and not for libcurl, is the fact that it is "only" a client on top of
> HTTP, and as such libcurl can be used as-is and does not need changing.

For me the main draw of having it in libcurl is that we can provide the same GET semantics to consumers while having self validating incrementally verified requests.
If I make my own C library which consumes libcurl either I have a different API (creates work for consumers) or I make my own fork but I believe me tracking you is more work overall than if libcurl would know how to reach out to the IPFS validation code, it's also annoying for distributions.

I don't understand why the line is drawn precisely at over-HTTP; the Trustless Gateway is used in production to transfer data between IPFS implementations.

Would the new Trustless-Gateway-over-HTTP-over-Libp2p protocol be a suitable candidate for libcurl? AFAICT it can't just be done on top since it requires running HTTP over yamux + tls + tcp (or other libp2p transports).
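The difference between the two gateway modes discussed in this message is mostly visible in the request. The sketch below is an illustration, not curl's implementation: the function names are hypothetical, the gateway address `http://127.0.0.1:8080` is an assumed local gateway, and the `format` query parameter and `application/vnd.ipld.*` media types are taken from my reading of the IPFS gateway specifications, so they are worth checking against the spec text.

```python
from urllib.parse import urlsplit


def path_gateway_url(ipfs_url: str, gateway: str) -> str:
    """Rewrite ipfs://<cid>/<path> into a path-gateway URL.

    The gateway resolves the path and returns the plain file, so the
    client has to trust whatever bytes come back.
    """
    parts = urlsplit(ipfs_url)
    if parts.scheme != "ipfs" or not parts.netloc:
        raise ValueError("expected an ipfs://<cid>[/path] URL")
    return f"{gateway.rstrip('/')}/ipfs/{parts.netloc}{parts.path}"


def trustless_request(ipfs_url: str, gateway: str, fmt: str = "raw"):
    """Build (url, headers) asking for verifiable data for a CID.

    fmt is "raw" (a single block) or "car" (a stream of blocks); the
    client is then expected to hash and verify what it receives.
    """
    if fmt not in ("raw", "car"):
        raise ValueError("fmt must be 'raw' or 'car'")
    parts = urlsplit(ipfs_url)
    if parts.scheme != "ipfs" or not parts.netloc:
        raise ValueError("expected an ipfs://<cid> URL")
    url = f"{gateway.rstrip('/')}/ipfs/{parts.netloc}?format={fmt}"
    headers = {"Accept": f"application/vnd.ipld.{fmt}"}
    return url, headers
```

Both requests are ordinary HTTP GETs; what changes is that the second one returns IPFS blocks that the client can check against the CID instead of an already deserialized file.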
Re: Adding IPFS Trustless Gateway Protocol Questions
On Thu, 26 Oct 2023, Hugo Valtier via curl-library wrote:

> Mark recently added ipfs:// support in the curl CLI; however, sadly it does
> not perform validation on the data received. I am interested in fixing that
> as well as moving the support into libcurl.

Hello and welcome to the curl project!

A primary reason I insisted on making the current implementation for the curl tool and not for libcurl, is the fact that it is "only" a client on top of HTTP, and as such libcurl can be used as-is and does not need changing.

If we are still talking about IPFS-over-HTTP then I believe this situation stands. If you want an IPFS-over-HTTP library you can make one on top of libcurl.

Am I wrong and if so, how and why?

--
 / daniel.haxx.se