Re: Fwd: Adding IPFS Trustless Gateway Protocol Questions

2023-10-26 Thread Hugo Valtier via curl-library
> my understanding was that there were two
> main ways to get files over IPFS: one is to get them via HTTP from an IPFS
> gateway that knows about IPFS (what curl does now) and the other is to become a
> full-fledged node in the IPFS network and speak the IPFS protocols to the
> world.

We have multiple HTTP gateway specifications; currently curl uses the IPFS
path gateway spec. In this spec curl receives the file in its non-IPFS form:
 Hash Validated
┌──────────────┐
│Host → Gateway│→ Curl
└──────────────┘
What I'll submit will use the IPFS trustless gateway spec, so curl will receive
the IPFS data structure, hash it, validate it and deserialize it:
    Hash Validated
┌─────────────────────┐
│Host → Gateway → Curl│
└─────────────────────┘
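
To make the second diagram concrete, here is a minimal sketch (not the code
from the upcoming pull request) of the client-side check for a single block.
It assumes the block was fetched as raw bytes (the trustless gateway spec
offers this with "Accept: application/vnd.ipld.raw"), that the CID is a
base32 CIDv1, and that the hash function is sha2-256. The function names
are made up for this illustration.

```python
import base64
import hashlib

SHA2_256 = 0x12  # multihash code for sha2-256


def read_varint(buf: bytes, pos: int = 0) -> tuple:
    """Decode an unsigned LEB128 varint; return (value, next position)."""
    value = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def parse_cid_v1(cid: str) -> tuple:
    """Split a base32 CIDv1 string into (codec, hash code, digest)."""
    if not cid.startswith("b"):
        raise ValueError("only the base32 multibase prefix 'b' is handled")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # multibase base32 omits the padding
    raw = base64.b32decode(body)
    version, pos = read_varint(raw)
    if version != 1:
        raise ValueError("not a CIDv1")
    codec, pos = read_varint(raw, pos)
    hash_code, pos = read_varint(raw, pos)
    length, pos = read_varint(raw, pos)
    digest = raw[pos:]
    if len(digest) != length:
        raise ValueError("malformed multihash")
    return codec, hash_code, digest


def verify_block(cid: str, block: bytes) -> bool:
    """Return True when the block bytes hash to the digest inside the CID."""
    _codec, hash_code, digest = parse_cid_v1(cid)
    if hash_code != SHA2_256:
        raise ValueError("this sketch only supports sha2-256")
    return hashlib.sha256(block).digest() == digest
```

With this check on the client, a gateway that returns wrong bytes is detected
instead of trusted, which is the difference between the two diagrams.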

> What you describe sounds like a third method, where one may somehow
> find a full IPFS node that happens to have your file and talk a subset of the
> IPFS protocol to get that file. Is that an accurate assessment?

Not exactly. In what I'll propose the server still acts as a gateway,
so if it doesn't have the file cached it will download it from someone else.
I agree it's not a perfect solution, because most public servers do not proxy
requests (only public gateways do), and running a private gateway requires
deploying a new service.
There is also the option of knowing out of band (OOB) which server hosts the
file you want, but that is less convenient.

> If so, is that really a mode that would be used by a significant number of 
> people?

I'm not gonna deny that there is a bubble effect going on, but it's
pretty popular within my own bubble.

Examples of the trustless gateway used in production between IPFS nodes are:
- https://www.pinata.cloud/ (SaaS IPFS)
- https://saturn.tech/ (P2P cdn)
- https://web3.storage/ (SaaS IPFS)
- (more I don't know about off the top of my head)

Other IPFS implementations also support it:
- https://github.com/filecoin-project/lassie (go cli, client)
- https://github.com/filecoin-project/boost (go daemon, server)
- https://github.com/ipfs/kubo (go daemon, server, client not yet)
- https://github.com/ipfs/helia (js browser and nodejs, client)
- https://github.com/little-bear-labs/ipfs-chromium (cpp blink runtime, client)

It is used for two big reasons:
1. IP transit costs are lower over HTTP than for the same bytes over QUIC
   or TCP, or so I have been told (even if you compare H3 vs QUIC).
   I don't know how much of it is that CDNs had decades to optimize their HTTP
   implementations vs how much is that pricing models for CDNs are weird.

2. HTTP is also widely implemented everywhere, making developers' lives
   easier when implementing IPFS features.

> How do you find an appropriate node for each file, for example?

The implementations I listed above query IPNI (federated indexers
which map hashes to servers) and/or the DHT (a distributed index
which maps hashes to servers); our DHT is based on Kademlia.
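
For readers unfamiliar with Kademlia, the core idea is that peers and content
share one keyspace and "closeness" is the XOR of two keys. Below is a minimal
sketch of that metric only. It assumes keys are derived with SHA-256, and the
peer names and helper functions are hypothetical; a real DHT lookup also
involves routing tables and iterative network queries, which are not shown.

```python
import hashlib


def key_of(name: bytes) -> int:
    """Map a peer ID or content hash into a shared 256-bit keyspace."""
    return int.from_bytes(hashlib.sha256(name).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two keys: their bitwise XOR."""
    return a ^ b


def closest_peers(target: bytes, peers: list, k: int = 3) -> list:
    """Return the k peers whose keys are XOR-closest to the target key."""
    target_key = key_of(target)
    ranked = sorted(peers, key=lambda p: xor_distance(key_of(p), target_key))
    return ranked[:k]
```

A lookup for a content hash asks the closest known peers for even closer
ones, until it reaches the peers responsible for recording who hosts it.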

Having content routing in curl would allow it to skip the gateway and talk
to content hosts directly.
I think it makes more sense to figure out data validation before
trying to download content from strangers on the internet.
If that is even a thing curl wants to do.
-- 
Unsubscribe: https://lists.haxx.se/mailman/listinfo/curl-library
Etiquette:   https://curl.se/mail/etiquette.html


Re: Fwd: Adding IPFS Trustless Gateway Protocol Questions

2023-10-26 Thread Dan Fandrich via curl-library
On Thu, Oct 26, 2023 at 03:42:34PM +0200, Hugo Valtier via curl-library wrote:
> Instead of using the Path Gateway it uses the Trustless Gateway which
> answers with a stream of blocks and walks the merkle-tree, verifies
> hashes and deserializes it on the fly.
> This would make curl or libcurl capable of downloading ipfs:// content
> from any reachable IPFS node, not just a localhost trusted one.

I'm far from an expert in IPFS, but my understanding was that there were two
main ways to get files over IPFS: one is to get them via HTTP from an IPFS
gateway that knows about IPFS (what curl does now) and the other is to become a
full-fledged node in the IPFS network and speak the IPFS protocols to the
world.  What you describe sounds like a third method, where one may somehow
find a full IPFS node that happens to have your file and talk a subset of the
IPFS protocol to get that file. Is that an accurate assessment? If so, is that
really a mode that would be used by a significant number of people?  How do you
find an appropriate node for each file, for example?


Re: Fwd: Adding IPFS Trustless Gateway Protocol Questions

2023-10-26 Thread Hugo Valtier via curl-library
> I know very little about IPFS but it certainly sounds like the right way.

Good, I'll change my code to be an improvement of the existing ipfs:// feature
in the CLI and I think we can continue discussion when I have a pull request.

Thx for your time.

> Would this not rather *replace* the URL rewrite approach? I mean, isn't
> verifying the content always better than not verifying? And if not, how would
> a user know when to use which?

Yeah sorry, my explanation was confusing.
With the current ipfs:// feature in curl, the IPFS decoding and hash validation
happen server side; instead I'll submit some code which does that in curl.
So a buggy or badly behaving server would be caught by curl, which would raise
an error.

> One of my most important tasks in this project is to resist creeping featurism
> - and everyone who has been around here a while knows that we are still adding
> features and changes at a rather high pace. To make sure that the line in the
> sand that is drawn between what is suitable for libcurl and what is NOT
> suitable for libcurl remains intact and untouched as much as possible. I think
> sticking to "core" transport protocols could be considered one of those things
> that so far has defined if a protocol is meant for libcurl or not.
>
> Lines in sand are by definition not distinct or always very clear, but I think
> we do everyone a service when we resist changing how that line is drawn.
>
> Of course, sometimes we end up clarifying or changing how we view the world
> after discussions and deliberating and then we adapt and move on with an
> updated view. But until that moment, we resist it.

Fair enough.

> I'm sorry but I don't understand the question. If IPFS-over-HTTP is not for
> libcurl, then how is Trustless-Gateway-over-HTTP-over-Libp2p any better?

I used it to poke holes in:
> If you want an IPFS-over-HTTP library you can make one on top of
> libcurl.

but as you pointed out:
> Lines in sand are by definition not distinct or always very clear, but I think
> we do everyone a service when we resist changing how that line is drawn.

so this is irrelevant.
I think the best path forward is for me to send a PR iterating on what is
already in the cli.

> Are you proposing that libcurl would use another library for IPFS that itself
> would have a *separate* HTTP implementation and this, because you're doing
> HTTP in a non-standard way?

I was saying that in case the suggestion was that, by doing IPFS-over-HTTP,
curl wouldn't speak the real IPFS protocol.
This is still experimental; I wasn't planning on this in libcurl any time soon.
A hypothetical implementation of this would use curl's HTTP code, but running
over custom streams with TCP semantics (which are multiplexed and encrypted).

This protocol's main goal is to allow us to use existing HTTP clients over our
mutually encrypted P2P network, which is really useful in browsers.


Re: Fwd: Adding IPFS Trustless Gateway Protocol Questions

2023-10-26 Thread Daniel Stenberg via curl-library

On Thu, 26 Oct 2023, Hugo Valtier via curl-library wrote:


> This would make curl or libcurl capable of downloading ipfs:// content
> from any reachable IPFS node, not just a localhost trusted one.
> If anything, do we agree that doing this is a desirable improvement,
> assuming it is implemented in curl next to the current ipfs:// URL rewriting?


I know very little about IPFS but it certainly sounds like the right way.

Would this not rather *replace* the URL rewrite approach? I mean, isn't 
verifying the content always better than not verifying? And if not, how would 
a user know when to use which?


> For me the main draw of having it in libcurl is that we can provide the same
> GET semantics to consumers while having self validating incrementally
> verified requests.


Right, for application authors who want to add IPFS support next to some other 
protocols that libcurl already supports that seems like a decent benefit. I 
just don't see the user demand for that while at the same time it would add a 
significant cost and complexity to the library - plus the fact that
"hierarchically" it is not a clear-cut fit for libcurl.


> If I make my own C library which consumes libcurl, either I have a different
> API (which creates work for consumers) or I maintain my own fork, but I
> believe tracking your changes in a fork is more work overall than if libcurl
> knew how to reach out to the IPFS validation code. It's also annoying for
> distributions.


To provide IPFS-over-HTTP as a library, I would probably argue that doing this 
as a separate library that itself uses libcurl is the way forward. The API 
could even be designed to work as a libcurl companion perhaps rather than a 
straight layer on top.


> I don't understand why the line is drawn precisely at over-HTTP; the
> Trustless Gateway is used in production to transfer data between IPFS
> implementations.


One of my most important tasks in this project is to resist creeping featurism
- and everyone who has been around here a while knows that we are still adding
features and changes at a rather high pace. To make sure that the line in the
sand that is drawn between what is suitable for libcurl and what is NOT
suitable for libcurl remains intact and untouched as much as possible. I think
sticking to "core" transport protocols could be considered one of those things
that so far has defined if a protocol is meant for libcurl or not.


Lines in sand are by definition not distinct or always very clear, but I think 
we do everyone a service when we resist changing how that line is drawn.


Of course, sometimes we end up clarifying or changing how we view the world 
after discussions and deliberating and then we adapt and move on with an 
updated view. But until that moment, we resist it.


> Would the new Trustless-Gateway-over-HTTP-over-Libp2p protocol be a suitable
> candidate for libcurl?


I'm sorry but I don't understand the question. If IPFS-over-HTTP is not for 
libcurl, then how is Trustless-Gateway-over-HTTP-over-Libp2p any better?


And why do HTTP-over-libp2p at all?

> AFAIT it can't just be done on top since it requires running HTTP over yamux
> + tls + tcp (or other libp2p transports).


Are you proposing that libcurl would use another library for IPFS that itself 
would have a *separate* HTTP implementation and this, because you're doing 
HTTP in a non-standard way? That sounds so crazy I must have misunderstood!


--

 / daniel.haxx.se
 | Commercial curl support up to 24x7 is available!
 | Private help, bug fixes, support, ports, new features
 | https://curl.se/support.html


Fwd: Adding IPFS Trustless Gateway Protocol Questions

2023-10-26 Thread Hugo Valtier via curl-library
> If we are still talking about IPFS-over-HTTP then I believe this situation
> stands. If you want an IPFS-over-HTTP library you can make one on top of
> libcurl.
>
> Am I wrong and if so, how and why?

Right now it is indeed "IPFS-over-HTTP", however it is different from
the current implementation.
Instead of using the Path Gateway it uses the Trustless Gateway, which
answers with a stream of blocks; curl then walks the merkle-tree, verifies
hashes and deserializes it on the fly.
This would make curl or libcurl capable of downloading ipfs:// content
from any reachable IPFS node, not just a localhost trusted one.
If anything, do we agree that doing this is a desirable improvement,
assuming it is implemented in curl next to the current ipfs:// URL rewriting?
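
As an illustration of what consuming such a stream of blocks involves, here
is a minimal sketch that assumes the response is a CARv1 stream whose blocks
are addressed by sha2-256 CIDs. It only splits the stream into its
length-prefixed sections and checks each block against its own CID. It does
not decode the header, does not check that each block is one the merkle-tree
actually links to, and does not deserialize UnixFS, all of which a real
client needs. The function names are hypothetical.

```python
import hashlib
import io
from typing import BinaryIO, Iterator, Optional


def read_varint(stream: BinaryIO) -> Optional[int]:
    """Read one unsigned LEB128 varint; None means a clean end of stream."""
    value = shift = 0
    while True:
        byte = stream.read(1)
        if not byte:
            if shift == 0:
                return None
            raise ValueError("truncated varint")
        value |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            return value
        shift += 7


def read_exact(stream: BinaryIO, size: int) -> bytes:
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("truncated section")
    return data


def read_sha256_digest(section: BinaryIO) -> bytes:
    """Consume the CID at the start of a section; return its sha2-256 digest."""
    first = read_varint(section)
    if first == 0x12:              # CIDv0 is a bare sha2-256 multihash
        hash_code = first
    elif first == 1:               # CIDv1: version, codec, then multihash
        read_varint(section)       # content codec, unused in this sketch
        hash_code = read_varint(section)
    else:
        raise ValueError("unsupported CID version")
    length = read_varint(section)
    if hash_code != 0x12 or length != 32:
        raise ValueError("this sketch only supports sha2-256")
    return read_exact(section, 32)


def iter_verified_blocks(stream: BinaryIO) -> Iterator[bytes]:
    """Yield the data of each block, raising ValueError on a hash mismatch."""
    header_len = read_varint(stream)
    if header_len is None:
        raise ValueError("empty stream")
    read_exact(stream, header_len)  # DAG-CBOR header, skipped here
    while (section_len := read_varint(stream)) is not None:
        section = io.BytesIO(read_exact(stream, section_len))
        digest = read_sha256_digest(section)
        data = section.read()
        if hashlib.sha256(data).digest() != digest:
            raise ValueError("block does not match its CID")
        yield data
```

Because each block is checked as soon as it arrives, a transfer can be
aborted at the first bad block rather than after the whole download.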

> A primary reason I insisted on making the current implementation for the curl
> tool and not for libcurl, is the fact that it is "only" a client on top of
> HTTP, and as such libcurl can be used as-is and does not need changing.

For me the main draw of having it in libcurl is that we can provide
the same GET semantics to consumers while having self-validating,
incrementally verified requests.
If I make my own C library which consumes libcurl, either I have a
different API (which creates work for consumers) or I maintain my own fork,
but I believe tracking your changes in a fork is more work overall than if
libcurl knew how to reach out to the IPFS validation code. It's also
annoying for distributions.

I don't understand why the line is drawn precisely at over-HTTP; the
Trustless Gateway is used in production to transfer data between IPFS
implementations.
Would the new Trustless-Gateway-over-HTTP-over-Libp2p protocol be a
suitable candidate for libcurl? AFAIT it can't just be done on top,
since it requires running HTTP over yamux + tls + tcp (or other libp2p
transports).


Re: Adding IPFS Trustless Gateway Protocol Questions

2023-10-26 Thread Daniel Stenberg via curl-library

On Thu, 26 Oct 2023, Hugo Valtier via curl-library wrote:

> Mark recently added ipfs:// support in the curl CLI however sadly it does
> not perform validation on the data received. I am interested in fixing that
> as well as moving the support into libcurl.


Hello and welcome to the curl project!

A primary reason I insisted on making the current implementation for the curl 
tool and not for libcurl, is the fact that it is "only" a client on top of 
HTTP, and as such libcurl can be used as-is and does not need changing.


If we are still talking about IPFS-over-HTTP then I believe this situation 
stands. If you want an IPFS-over-HTTP library you can make one on top of 
libcurl.


Am I wrong and if so, how and why?
