I beg your pardon if I made you angry, Daniel, by understating the "simplicity" of HTTP!
You are not responding to the point about "control inversion" though. Let me try to better explain what happens to a "read" with the current behaviour. You know it, since you coded it yourself in "fcurl". Let's assume we have done all the bits to start the transfer, and now call fcurl_read(). At some point you arrive in the function transfer(), which calls curl_multi_perform() (at line 123). curl_multi_perform() will not return as long as there is data present (the "transfer loop"), and will repeatedly invoke write_callback(), which must then store the excess data.

How much data are we talking about here? Well, I made the test on my (slow) laptop. Using curl_easy_recv() over plain HTTP, I get a lot of EAGAIN; that is where curl_multi_perform() would return because the socket is empty, so there the stored excess stays "reasonable", around a few kilobytes. But over HTTPS I get something like 500 EAGAIN in total on the 10GB file, because my PC is half as fast at doing TLS as the network, so there is almost always "data in the pipe" for curl_multi_perform() to consume. That would mean allocating 20MB or more per transfer. In theory, on an even slower machine, or under some CPU workload, what the current fcurl_read() could end up doing is: put the whole file in memory, then serve the subsequent fcurl_read() calls from memory!

Performance aside, it is not an acceptable "risk" when writing a filesystem: you don't know how many concurrent files/streams can be open at the same time. In the case of "random reads", you would have buffered megabytes of data you just don't need.

To avoid that, the current solution I am using is to run the curl transfers in separate threads, and block the callback with a semaphore once I have all the data I need to satisfy the reads. That makes for complex code with thread communication, locking, atomic counters, etc... very prone to errors and bugs. And that is because the "transfer loop", in the current architecture, is the property of libcurl.
If the control of that loop could be reversed, the "transfer loop" would become the caller's property, and we would have no need for separate threads to avoid the drawback explained above. The control of how many bytes we want to pull out would be in the hands of the caller's transfer loop (plus a few more kilobytes in intermediate layers like TLS). Excess bytes stay in the socket's buffer (or in the intermediate layers). Even without talking "performance", because a "cache miss" in a network filesystem is indeed much worse than some additional copy/alloc/semaphore context switch, I am looking at a major simplification of my code.

As of today, the solutions I could think of for that simplification are:
- using curl_easy_recv()
- writing my own transfer/socket code, possibly with inspiration from other code like GNU wget, which is very far from doing all that curl does (and does not have a "library") but can also be considered solid in what it does (HTTP limited to 1.1, apparently).

The right balance to get control of the "transfer loop" back into my caller code, with all the simplifications it brings, seems to me to be the first option: curl_easy_recv(). Yes, I am aware I will then need to replicate some existing code (http/1.1 being a known protocol), either from libcurl or inspired by other code, but I hope the balance will still lean in the right direction.

Summary keywords, if you want to react to that, could be: control inversion of the transfer loop.

Sorry again. Cheers
Alain
------------------------------------------------------------------- Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library Etiquette: https://curl.se/mail/etiquette.html