Hi,

This proposal aims to solve an old problem: making it easier to set up a mirror of the official substitute server, and preventing future complaints from users in China about network speed.

Due to the national firewall (the GFW) deployed on every backbone network within China, and its aggressive rules on traffic passing through it, internet connections from inside China to the outside world are extremely unreliable and slow. This makes it a pain for newcomers to try Guix, since guix-daemon sends thousands of HTTP requests to the substitute server and downloads thousands of megabytes of packages from it. Installing Guix System on a new computer usually takes a whole day. I am serious.

I have been a Guix user for several years and the problem does not bother me much personally. But since the 1.0 release, Guix has been mentioned frequently in tech news. When tech-savvy newcomers hear about Guix, try to install it, and still cannot complete the installation after half a day, they get angry about the slow network and complain online. I have mentioned Guix several times on some websites, which makes me the first Chinese Guix user they find with Google and ask for help. I am sure the maintainers here know this kind of complaint.

Do you find it strange that no Chinese user has made new complaints in 2020, while the Guix project is progressing quickly and becoming more popular? That is because I decided to set up a mirror server for Chinese Guix users, after receiving several complaints online and realizing that lobbying the maintainers of academic FLOSS mirrors to support Guix would go nowhere for at least several years, due to these maintainers' laziness and cowardice and some ridiculously strict governmental regulations. These people only want to add a mirror for a project if it is as simple as pulling static files with a cron job (usually with rsync) and serving them through a simple HTTP server. An HTTP reverse proxy is not an option. So my mirror.guix.org.cn project was started.

It is an HTTP caching mirror of ci.guix.gnu.org, plus a Git mirror of https://git.savannah.gnu.org/cgit/guix.git. Yes, the connection to Savannah is also extremely slow, which makes `guix pull` unusable. This mirror server started as an experiment and it has been working well. If a new user comes to me and says that `guix pull` or `guix install` is slow, I simply tell them to use mirror.guix.org.cn. The number of active Chinese Guix users I know of has increased from two to about ten after someone's announcement in a newsgroup. It was basically: "Look. There is a thing called Guix. Someone has set up a mirror server for it in China." The traffic on the server is increasing. Network connections are stable. Everything has been fine this year.

However, one thing worries me. The bandwidth of mirror.guix.org.cn is only 5Mbps (still far better than ci.guix.gnu.org's 30KB/s and constant connection resets). This is the highest bandwidth I can afford, because internet bandwidth in China is far too expensive. Buying more bandwidth is not financially possible for me. This is not a problem in the short term, but it will definitely be a problem in the long term. Persuading academic FLOSS mirror maintainers to support Guix is still the best solution for Chinese users. Academic organizations usually have 100Mbps of bandwidth and tens of terabytes of disk.

Now, finally, we come to the main point. I looked into guix publish's cache directory and I think that nar and narinfo files could be served directly by a static HTTP server if we made those files' URLs identical to their on-disk file names.

The current directory structure is like this:

--8<---------------cut here---------------start------------->8---
/var/cache/guix/publish
├── gzip
│   ├── 87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.nar
│   ├── 87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.narinfo
│   ├── fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31.nar
│   └── fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31.narinfo
├── hashes
│   ├── 87kif0bpf0anwbsaw0jvg8fyciw4sz67
│   └── fa6wj5bxkj5ll1d7292a70knmyl7a0cr
├── last-expiry-cleanup
└── lzip
    ├── 87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.nar
    ├── 87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.narinfo
    ├── fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31.nar
    └── fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31.narinfo
--8<---------------cut here---------------end--------------->8---

The narinfo files in the gzip and lzip directories are identical:

--8<---------------cut here---------------start------------->8---
> md5sum /var/cache/guix/publish/*/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.narinfo
29cdbf041b9a304bf58f2e75ec23f18f  /var/cache/guix/publish/gzip/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.narinfo
29cdbf041b9a304bf58f2e75ec23f18f  /var/cache/guix/publish/lzip/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.narinfo
--8<---------------cut here---------------end--------------->8---

When a client tries to download /gnu/store/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16, it sends a request to http://example.com/87kif0bpf0anwbsaw0jvg8fyciw4sz67.narinfo and gets the content of /var/cache/guix/publish/gzip/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.narinfo:

--8<---------------cut here---------------start------------->8---
StorePath: /gnu/store/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16
URL: nar/gzip/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16
Compression: gzip
FileSize: 2284657
URL: nar/lzip/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16
Compression: lzip
FileSize: 1256260
NarHash: sha256:1ap2s3xz3bbp5n78v826gxagy7pic1wpgzz3ka72jdyk6qpmw3qr
NarSize: 6597040
References: ...
System: x86_64-linux
Deriver: cccyyn4xq59aimybmhlrfl2bi8kslhlm-bash-5.0.16.drv
Signature: ...
--8<---------------cut here---------------end--------------->8---

The client then sends a request to the URL given in the URL field: http://example.com/nar/lzip/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16. This URL returns the file /var/cache/guix/publish/lzip/87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.nar.

I propose that we make the URL field in the narinfo the same as the nar file name on disk. We can change the directory structure to:

--8<---------------cut here---------------start------------->8---
/var/cache/guix/publish/nar
├── 87kif0bpf0anwbsaw0jvg8fyciw4sz67.narinfo
├── 87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.nar.gz
└── 87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.nar.lz
--8<---------------cut here---------------end--------------->8---

And change the URL field in the narinfo to:

--8<---------------cut here---------------start------------->8---
URL: 87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16.nar.lz
--8<---------------cut here---------------end--------------->8---

Then a mirror site can simply pull the directory /var/cache/guix/publish/nar from the Berlin server and serve it through a static HTTP server. There will be cache misses, but guix-daemon will safely fall back to the next server in substitute-urls.

What's your opinion? I have to decide next year's server specs and budget for mirror.guix.org.cn before the Chinese shopping festival ends on November 11. If the proposal above is doable, I will keep mirror.guix.org.cn running for half a year and help academic mirror sites add support for Guix in the meantime. Otherwise I would prefer to buy a three-year prepaid VPS at a 90% discount during the shopping festival. The discount is huge and I don't want to miss it.

-- 
Peng Mei Yu
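
P.S. To make the renaming concrete, here is a minimal Python sketch of the mapping from the current cache paths and narinfo URL fields to the flat names proposed above. It is only an illustration, not code from guix publish: the names `proposed_name`, `proposed_url` and `SUFFIX` are hypothetical, and the `.nar.gz` / `.nar.lz` suffixes are simply the ones used in my example tree.

```python
# Assumed suffixes, copied from the example file names in the proposal.
SUFFIX = {"gzip": ".nar.gz", "lzip": ".nar.lz"}


def proposed_name(cache_path: str) -> str:
    """Map a path relative to the current cache root to its proposed flat name.

    'lzip/<hash>-<name>.nar'     -> '<hash>-<name>.nar.lz'
    'lzip/<hash>-<name>.narinfo' -> '<hash>.narinfo'
    """
    compression, filename = cache_path.split("/")
    if filename.endswith(".narinfo"):
        # The narinfo is the same for every compression, so only the
        # hash part (everything before the first '-') is kept.
        return filename.split("-", 1)[0] + ".narinfo"
    if filename.endswith(".nar"):
        return filename[: -len(".nar")] + SUFFIX[compression]
    raise ValueError(f"not a nar or narinfo file: {cache_path}")


def proposed_url(current_url: str) -> str:
    """Map a current narinfo URL field to the proposed one.

    'nar/lzip/<hash>-<name>' -> '<hash>-<name>.nar.lz'
    """
    prefix, compression, item = current_url.split("/")
    if prefix != "nar":
        raise ValueError(f"unexpected URL field: {current_url}")
    return item + SUFFIX[compression]


if __name__ == "__main__":
    item = "87kif0bpf0anwbsaw0jvg8fyciw4sz67-bash-5.0.16"
    print(proposed_name(f"lzip/{item}.nar"))
    print(proposed_name(f"gzip/{item}.narinfo"))
    print(proposed_url(f"nar/lzip/{item}"))
```

With this mapping, the value of the URL field and the name of the file on disk are the same string, which is what lets a plain static HTTP server (or an rsync-based mirror) serve the directory without any rewriting rules.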