Re: Proposal for a blog contribution on reproducible computations
Hi,

I had missed that message.

Konrad Hinsen writes:

> Ludovic Courtès writes:
>
>> Another thing that comes to mind: would it make sense to mention ‘guix
>> graph’ in the part where you pipe the output of ‘guix show’ to ‘recsel’,
>> etc.?
>
> Forgot that one, sorry. Yes, it would make sense, though I'd place it a
> bit later in the text. But I'd have to figure out first how the
> various options of "guix graph" relate exactly to what I am writing.
>
>   ‘package’
>        This is the default type used in the example above. It shows the
>        DAG of package objects, excluding implicit dependencies. It is
>        concise, but filters out many details.
>
> Are "implicit dependencies" those added by the build system? If yes,
> the edges in this graph would correspond to package-direct-inputs.

Exactly.

>   ‘bag’
>        Similar to ‘bag-emerged’, but this time including all the bootstrap
>        dependencies.
>
> And that is package-closure with arrows defined by bag-direct-inputs, right?

Yes.

Thanks,
Ludo’.
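The distinction discussed above (edges from declared inputs only versus edges that also include what the build system adds, and the closure over those edges) can be sketched with a toy graph. This is not Guix code: the package names, the two tables and the helper functions are all made up for illustration, and they only loosely mirror how package-direct-inputs, bag-direct-inputs and package-closure are described in this thread.

```python
# Toy model only: every name and every edge below is hypothetical.
EXPLICIT_INPUTS = {
    "pi-program": ["libfoo"],
    "libfoo": [],
    "toy-gcc": [],
    "toy-make": [],
}

# Inputs that an imaginary build system adds to the packages it builds.
IMPLICIT_INPUTS = {
    "pi-program": ["toy-gcc", "toy-make"],
    "libfoo": ["toy-gcc", "toy-make"],
    "toy-gcc": [],
    "toy-make": [],
}


def explicit_direct_inputs(name):
    """Edges of a 'package'-style graph: declared inputs only."""
    return list(EXPLICIT_INPUTS[name])


def all_direct_inputs(name):
    """Edges of a 'bag'-style graph: declared plus implicit inputs."""
    return EXPLICIT_INPUTS[name] + IMPLICIT_INPUTS[name]


def closure(root, direct_inputs):
    """Return the set of nodes reachable from root via direct_inputs."""
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(direct_inputs(name))
    return seen


if __name__ == "__main__":
    print(sorted(closure("pi-program", explicit_direct_inputs)))
    print(sorted(closure("pi-program", all_direct_inputs)))
```

In this toy data, the closure over declared inputs misses the compiler and make entirely, which is the kind of detail the ‘package’ graph type is said to filter out.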
Re: Proposal for a blog contribution on reproducible computations
Konrad Hinsen writes:

>> Perfect, thanks! I had to move files where Haunt expects them, and to
>> make the table pure ASCII because guile-commonmark doesn’t support HTML
>> nor tables, but here we are:
>>
>>   https://guix.gnu.org/blog/2020/reproducible-computations-with-guix/
>>   https://hpc.guix.info/blog/2020/01/reproducible-computations-with-guix/
>
> Looks good, thanks!
>
> One problem though: the link to the script is broken:
>
>   https://guix.gnu.org/blog/2020/reproducible-computations-with-guix/show-dependencies.scm
>
> yields 404.

Oops, sorry about that! Should be fixed now.

> I had put it into the same directory as the Markdown file and the
> images, apparently Haunt didn't like that either.

Yeah, it has to be in website/static.

Thanks,
Ludo’.
Re: Proposal for a blog contribution on reproducible computations
Hi Ludo,

> Perfect, thanks! I had to move files where Haunt expects them, and to
> make the table pure ASCII because guile-commonmark doesn’t support HTML
> nor tables, but here we are:
>
>   https://guix.gnu.org/blog/2020/reproducible-computations-with-guix/
>   https://hpc.guix.info/blog/2020/01/reproducible-computations-with-guix/

Looks good, thanks!

One problem though: the link to the script is broken:

  https://guix.gnu.org/blog/2020/reproducible-computations-with-guix/show-dependencies.scm

yields 404.

I had put it into the same directory as the Markdown file and the
images, apparently Haunt didn't like that either.

Cheers,
  Konrad.
Re: Proposal for a blog contribution on reproducible computations
Hello!

Konrad Hinsen writes:

>> You can post a patch against the guix-artwork.git repo here when you’re
>> ready.
>
> Here it comes!

Perfect, thanks! I had to move files where Haunt expects them, and to
make the table pure ASCII because guile-commonmark doesn’t support HTML
nor tables, but here we are:

  https://guix.gnu.org/blog/2020/reproducible-computations-with-guix/
  https://hpc.guix.info/blog/2020/01/reproducible-computations-with-guix/

Spread the word!

Thank you, Konrad.

Ludo’.
Re: Proposal for a blog contribution on reproducible computations
Hello!

Konrad Hinsen writes:

>> Minor comments:
>>
>>   • You write “Build systems are packages as well”. This could be
>>     slightly misleading: build systems are (1) a set of packages, and
>>     (2) a build procedure. Dunno if it makes sense to clarify that.
>
> Maybe I got something wrong, but I think I described this as you say
> (please check!). Quote:
>
>   Build systems are pieces of Guile code that are part of Guix. But this
>   Guile code is only a shallow layer orchestrating invocations of other
>   software, such as =gcc= or =make=. And that software is defined by
>   packages.
>
> The build procedure is that "shallow layer orchestrating invocations".
> Does this sound right?

Oh yes, that’s entirely correct! It’s just the section title that I
thought could be misleading, but maybe not given this explanation.

>>   • Regarding ‘--container’, you write that namespaces “may not be
>>     present on your system, or may be disabled by default”, which is a
>>     bit strong; “may be present on your system, but perhaps disabled by
>>     default” would be more accurate. :-)
>
> Fixed. I don't know anything about the implementation techniques of
> --container, so I'll blindly write what you say :-)

It relies on “unprivileged user namespaces”, a Linux feature that’s been
around for some time, and is almost always compiled in, but is disabled
by default on some major distros.

Thanks!

Ludo’.
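As a rough illustration of the "compiled in but disabled" situation described above, here is a minimal Python sketch that inspects two kernel settings. It is a heuristic under stated assumptions, not how Guix itself decides: `sys/user/max_user_namespaces` is, as far as I know, the upstream limit on user namespaces, while `sys/kernel/unprivileged_userns_clone` is the distribution-specific knob named in the error message quoted elsewhere in this thread. The function names, the `proc_root` parameter and the returned strings are all made up for this example.

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a sysctl-style file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def userns_status(proc_root="/proc"):
    """Heuristically guess whether unprivileged user namespaces are usable.

    proc_root is a parameter only so the function can be tried against a
    fake directory tree; on a real system the default is what you want.
    """
    root = Path(proc_root)
    # Assumed upstream knob: a value of 0 forbids creating user namespaces.
    max_ns = read_int(root / "sys/user/max_user_namespaces")
    # Distribution-specific knob; absent on many systems, which is fine.
    clone = read_int(root / "sys/kernel/unprivileged_userns_clone")

    if max_ns is None:
        return "apparently unavailable"
    if max_ns == 0 or clone == 0:
        return "present but disabled"
    return "enabled"


if __name__ == "__main__":
    print("unprivileged user namespaces:", userns_status())
```

A result of "present but disabled" would correspond to the case where `guix environment --container` fails until an administrator changes the setting.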
Re: Proposal for a blog contribution on reproducible computations
Hi Konrad,

Konrad Hinsen writes:

[...]

> Maybe I got something wrong, but I think I described this as you say
> (please check!). Quote:
>
>   Build systems are pieces of Guile code that are part of Guix. But this
>   Guile code is only a shallow layer orchestrating invocations of other
>   software, such as =gcc= or =make=. And that software is defined by
>   packages.
>
> The build procedure is that "shallow layer orchestrating invocations".
> Does this sound right?

What about "shallow layer (build procedure) orchestrating invocations",
just to be super-clear? :-)

[...]

> Giovanni Biscuolo writes:
>
>>> (which is sad because your Org file with Babel sessions is much nicer…).
>>> I think Pierre had something to convert Org to Markdown.
>>
>> you could try pandoc or emacs-ox-hugo, both in Guix
>> I can help convert/adapt if needed
>
> My plan for now is pandoc, but if that doesn't work as expected, I'll
> come back to your offer for help!

I have seldom used pandoc to convert Org to Markdown; please give me
feedback on the quality of the result, thanks!

[...]

Ciao, Gio'

-- 
Giovanni Biscuolo
Xelera IT Infrastructures
Re: Proposal for a blog contribution on reproducible computations
Ludovic Courtès writes:

> Another thing that comes to mind: would it make sense to mention ‘guix
> graph’ in the part where you pipe the output of ‘guix show’ to ‘recsel’,
> etc.?

Forgot that one, sorry. Yes, it would make sense, though I'd place it a
bit later in the text. But I'd have to figure out first how the
various options of "guix graph" relate exactly to what I am writing.

  ‘package’
       This is the default type used in the example above. It shows the
       DAG of package objects, excluding implicit dependencies. It is
       concise, but filters out many details.

Are "implicit dependencies" those added by the build system? If yes,
the edges in this graph would correspond to package-direct-inputs.

  ‘bag’
       Similar to ‘bag-emerged’, but this time including all the bootstrap
       dependencies.

And that is package-closure with arrows defined by bag-direct-inputs, right?

Cheers,
  Konrad.
Re: Proposal for a blog contribution on reproducible computations
Hi Ludo, Simon, and Giovanni,

Thanks for your feedback!

> Minor comments:
>
>   • You write “Build systems are packages as well”. This could be
>     slightly misleading: build systems are (1) a set of packages, and
>     (2) a build procedure. Dunno if it makes sense to clarify that.

Maybe I got something wrong, but I think I described this as you say
(please check!). Quote:

  Build systems are pieces of Guile code that are part of Guix. But this
  Guile code is only a shallow layer orchestrating invocations of other
  software, such as =gcc= or =make=. And that software is defined by
  packages.

The build procedure is that "shallow layer orchestrating invocations".
Does this sound right?

>   • In the ‘guix pack’ example, you could perhaps omit all the -S flags
>     except for /bin, and mention ‘--save-provenance’.

I'll have to look up ‘--save-provenance’ first. I don't use "guix pack"
that much, though I should probably use it more, if only to expose more
people indirectly to Guix.

>   • Would it make sense to mention MPFR in the paragraph about IEEE 754?

I considered it, but left it out because it would probably create
confusion. And people who are aware of MPFR probably don't need my
explanation of floats.

>   • Regarding ‘--container’, you write that namespaces “may not be
>     present on your system, or may be disabled by default”, which is a
>     bit strong; “may be present on your system, but perhaps disabled by
>     default” would be more accurate. :-)

Fixed. I don't know anything about the implementation techniques of
--container, so I'll blindly write what you say :-)

> The format we use is Markdown fed to Haunt:

OK, pandoc should get me there.

> You can post a patch against the guix-artwork.git repo here when you’re
> ready.

OK.

> If you want we can publish it next Tuesday or Thursday. We could have
> it on both hpc.guix.info and guix.gnu.org, with one saying that it’s a
> re-post of the other.

Fine with me!

zimoun writes:

> That said, I also find interesting the command-line and hashes comparisons:
>
> --8<---cut here---start->8---
> /usr/bin/gcc pi.c -o pi-debian-gcc8
> docker run -v `pwd`:`pwd` -w `pwd` -ti gcc-toolchain gcc pi.c -o pi-docker
> guix environment --container --ad-hoc gcc-toolchain -- gcc pi.c -o pi-guix
>
> md5sum pi-*
>
> b268af34d62763a2a707944403bf7b0b  pi-debian-gcc8
> 1be3c1b5d1e065017e4c56f725b1a692  pi-docker
> 1be3c1b5d1e065017e4c56f725b1a692  pi-guix
> --8<---cut here---end--->8---
>
> Anyway! :-)

Nice! Not sure I want to go into that because it requires adding another
system (Debian), which I think is mainly a source of confusion.

>>   • Would it make sense to mention MPFR in the paragraph about IEEE 754?
>
> And MPFI? ;-)

OK, I see another blog post coming ;-) But there are people more
competent than myself for that.

Giovanni Biscuolo writes:

>> (which is sad because your Org file with Babel sessions is much nicer…).
>> I think Pierre had something to convert Org to Markdown.
>
> you could try pandoc or emacs-ox-hugo, both in Guix
> I can help convert/adapt if needed

My plan for now is pandoc, but if that doesn't work as expected, I'll
come back to your offer for help!

Thanks everyone,
  Konrad.
Re: Proposal for a blog contribution on reproducible computations
Hello,

kudos for the great article!

Ludovic Courtès writes:

[...]

> The format we use is Markdown fed to Haunt:
>
>   https://git.savannah.gnu.org/cgit/guix/guix-artwork.git/tree/website/posts
>
> (which is sad because your Org file with Babel sessions is much nicer…).
> I think Pierre had something to convert Org to Markdown.

you could try pandoc or emacs-ox-hugo, both in Guix

I can help convert/adapt if needed

HTH!

[...]

-- 
Giovanni Biscuolo
Xelera IT Infrastructures
Re: Proposal for a blog contribution on reproducible computations
Hi Ludo,

On Fri, 10 Jan 2020 at 17:59, Ludovic Courtès wrote:

>   • In the ‘guix pack’ example, you could perhaps omit all the -S flags
>     except for /bin, and mention ‘--save-provenance’.

I am the culprit. The invocation of "guix pack -f docker" is not clear
to me. So basically, I copied/pasted the lines here [1] :-) because it
works all the time.

[1] http://bioinformatics.mdc-berlin.de/pigx/supplementary-materials.html

That said, I also find interesting the command-line and hashes comparisons:

--8<---cut here---start->8---
/usr/bin/gcc pi.c -o pi-debian-gcc8
docker run -v `pwd`:`pwd` -w `pwd` -ti gcc-toolchain gcc pi.c -o pi-docker
guix environment --container --ad-hoc gcc-toolchain -- gcc pi.c -o pi-guix

md5sum pi-*

b268af34d62763a2a707944403bf7b0b  pi-debian-gcc8
1be3c1b5d1e065017e4c56f725b1a692  pi-docker
1be3c1b5d1e065017e4c56f725b1a692  pi-guix
--8<---cut here---end--->8---

Anyway! :-)

>   • Would it make sense to mention MPFR in the paragraph about IEEE 754?

And MPFI? ;-)

All the best,
simon
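The hash comparison above can also be scripted, which helps once there are more than three binaries to compare. The following is a minimal Python sketch, not part of the original message: the function names are made up, and it defaults to SHA-256 rather than the MD5 used by `md5sum` above (pass `algorithm="md5"` to get the same kind of digests). It groups files by digest so that bit-identical builds end up on the same line.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, algorithm="sha256"):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def group_by_digest(paths, algorithm="sha256"):
    """Map each digest to the sorted names of the files that have it."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_digest(path, algorithm)].append(Path(path).name)
    return {digest: sorted(names) for digest, names in groups.items()}


if __name__ == "__main__":
    # Compare every file named pi-* in the current directory.
    binaries = sorted(Path(".").glob("pi-*"))
    for digest, names in group_by_digest(binaries).items():
        print(digest, " ".join(names))
```

With the three binaries from the message, one would expect `pi-docker` and `pi-guix` to share a line and `pi-debian-gcc8` to sit alone, assuming the files are the ones that produced the digests quoted above.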
Re: Proposal for a blog contribution on reproducible computations
Another thing that comes to mind: would it make sense to mention ‘guix graph’ in the part where you pipe the output of ‘guix show’ to ‘recsel’, etc.? Ludo’.
Re: Proposal for a blog contribution on reproducible computations
Hi Konrad,

Konrad Hinsen writes:

> Here is a first complete draft:
>
>   https://github.com/khinsen/reproducibility-with-guix/blob/master/reproducibility-with-guix.org
>
> Feedback welcome, be it by mail or as issues on GitHub.

I’ve read it entirely and I think it’s perfect. It’s a pleasant read,
it covers many aspects in a pedagogical way (if I’m able to judge
that!), and it always shows how these nitty-gritty details relate to
reproducible computations.

I like how you explain that it’s human interpretation that leads us to
split “inputs” and “outputs” into more specific categories (I had
already enjoyed that in one of your talks).

Minor comments:

  • You write “Build systems are packages as well”. This could be
    slightly misleading: build systems are (1) a set of packages, and
    (2) a build procedure. Dunno if it makes sense to clarify that.

  • In the ‘guix pack’ example, you could perhaps omit all the -S flags
    except for /bin, and mention ‘--save-provenance’.

  • Would it make sense to mention MPFR in the paragraph about IEEE 754?

  • Regarding ‘--container’, you write that namespaces “may not be
    present on your system, or may be disabled by default”, which is a
    bit strong; “may be present on your system, but perhaps disabled by
    default” would be more accurate. :-)

> Also, what is the procedure for submitting blog posts? What are the
> right formats for text and graphics?

The format we use is Markdown fed to Haunt:

  https://git.savannah.gnu.org/cgit/guix/guix-artwork.git/tree/website/posts

(which is sad because your Org file with Babel sessions is much nicer…).
I think Pierre had something to convert Org to Markdown.

To syntax-highlight Scheme code, you must start Scheme blocks with
“```scheme” in Markdown.

PNGs for graphics are good.

You can post a patch against the guix-artwork.git repo here when you’re
ready.

If you want we can publish it next Tuesday or Thursday. We could have
it on both hpc.guix.info and guix.gnu.org, with one saying that it’s a
re-post of the other.

Thank you for the great article!

Ludo’.
Re: Proposal for a blog contribution on reproducible computations
Hi Konrad,

Thank you! It is very interesting!!

Below are questions and suggestions, which I can submit as a pull
request on GitHub. :-)

I hope it is readable: indented text is your text; non-indented text is
my question.

Cheers,
simon

--

  #+TITLE: Reproducible computations with Guix
  #+STARTUP: inlineimages

  * Dependencies: what it takes to run a program

Move this section title below.

  This post is about reproducible computations, so let's start with a
  computation. A short, though rather uninteresting, C program is a good
  starting point. It computes π in three different ways:

  #+begin_src c :tangle pi.c :eval no
  #include <math.h>
  #include <stdio.h>

  int main()
  {
      printf( "M_PI : %.10lf\n", M_PI);
      printf( "4 * atan(1.) : %.10lf\n", 4.*atan(1.));
      printf( "Leibniz' formula (four terms): %.10lf\n", 4.*(1.-1./3.+1./5.-1./7.));
      return 0;
  }
  #+end_src

Align ':' for easier looking.

  This program uses no random element, such as a random number generator
  or parallelism. It's strictly deterministic. It is reasonable to expect
  it to produce exactly the same output, on any computer and at any point
  in time. And yet, many programs whose results /should/ be perfectly
  reproducible are in fact not. Programs using floating-point arithmetic,
  such as this short example, are particularly prone to seemingly
  inexplicable variations.

  My goal is to explain why deterministic programs often fail to be
  reproducible, and what it takes to fix this. The short answer to that
  question is "use Guix", but even though Guix provides excellent support
  for reproducibility, you still have to use it correctly, and that
  requires some understanding of what's going on. The explanation I will
  give is rather detailed, to the point of discussing parts of the Guile
  API of Guix. You should be able to follow the reasoning without knowing
  Guile though, you will just have to believe me that the scripts I will
  show do what I claim they do. And in the end, I will provide a
  ready-to-run Guile script that will let you explore package
  dependencies right from the shell.

* Dependencies: what it takes to run a program

  One keyword in discussions of reproducibility is "dependencies". I will
  revisit the exact meaning of this term later, but to get started, I
  will define it loosely as "any software package required to run a
  program". Running the π computation shown above is normally done using
  something like

  #+begin_src sh :exports code :eval no
  gcc pi.c -o pi && ./pi
  #+end_src

Missing '&&'. It does not work without it on my machine.

  C programmers know that =gcc= is a C compiler, so that's one obvious
  dependency for running our little program. But is a C compiler enough?
  That question is surprisingly difficult to answer in practice. Your
  computer is loaded with tons of software (otherwise it wouldn't be very
  useful), and you don't really know what happens behind the scenes when
  you run =gcc= or =pi=.

** Container is good

  A major element of reproducibility support in Guix is the possibility
  to run programs in well-defined environments that contain exactly the
  software packages you request, and no more. So if your program runs in
  an environment that contains only a C compiler, you can be sure it has
  no other dependencies. Let's create such an environment:

  #+begin_src sh :session C-compiler :results output :exports both
  guix environment --container --ad-hoc gcc-toolchain
  #+end_src

  #+RESULTS:

  The option =--container= ensures the best possible isolation from the
  standard environment that your system installation and user account
  provide for day-to-day work. This environment contains nothing but a C
  compiler and a shell (which you need to type in commands), and has
  access to no other files than those in the current directory.

  Side note: the option =--container= requires support from the Linux
  kernel that is not available on all systems. If it doesn't work for
  you, use =--pure= instead. It provides a less isolated environment, but
  it is usually more than good enough.

By default, I get:

--8<---cut here---start->8---
guix environment: error: cannot create container: unprivileged user cannot create user namespaces
guix environment: error: please set /proc/sys/kernel/unprivileged_userns_clone to "1"
--8<---cut here---end--->8---

Or a sentence explaining what to do. For example, "The =--container=
option requires allowing the kernel to clone for the unprivileged user,
i.e., as =root= just run the command =echo 1 > /proc/sys/ke